Preface



The key component of a communication, radar, or sonar system is its receiver, whose 
purpose is to detect information-bearing signals, signals that have been weakened and 
distorted during transmission and corrupted by random noise. The systematic design 
of receivers and the assessment of their performance are based on signal-detection 
theory, and that is the subject of this book. 

The detection of weak signals in noise can instructively be viewed in the 
framework of the statistical testing of hypotheses, whose elementary foundations are 
presented in Chapter 1. These are applied in Chapter 2 to the basic problem of 
detecting a signal of known form in Gaussian noise. There we introduce the concepts 
of matched filtering and signal-space representation and determine the ultimate 
limits on signal detectability. We then treat the detection of narrowband signals of 
unknown carrier phase, setting up a convenient formulation for narrowband signals 
and noise as a natural generalization of the phasors used in analyzing alternating 
currents and voltages (Chapter 3). The structure of optimum and near-optimum 
receivers to which multiple independent inputs are available is examined in Chapter 
4. In this setting we provide a brief treatment of beamforming in transducer arrays. 
The performance of receivers is measured by their probabilities of error, and the 
efficient computation of such probabilities is treated at some length in Chapter 5. 

Statistical estimation theory is introduced in Chapter 6 in order to study the 
estimation of signal parameters such as arrival time and Doppler shift, which provide 
information about the location and speed of radar targets. Calculation of the mean-square
errors in such estimates is emphasized. Maximum-likelihood detection, which
draws on the estimation of signal parameters, figures in the design of receivers of
radar signals of unknown arrival time and Doppler shift (Chapter 7). Chapter 8
describes how estimation theory plays a role in detecting signals in the presence of
noise of unknown strength. A broader uncertainty about the statistics of the noise 
calls for the use of nonparametric detectors, and we show how their performance can 
be assessed in terms of their asymptotic efficiency relative to a standard detector. For 
the sake of efficient scanning of the sky for radar targets, the designer should 
consider applying sequential processing, which is the topic of Chapter 9. The resolution
of signals that are close together in arrival time and carrier frequency is approached
in Chapter 10 through the estimation of signal parameters, and the efficacy
of receivers that attempt to resolve close radar targets is treated in terms of the 
ambiguity function. 

Minimum-mean-square-error estimation of time series corrupted by noise is
also briefly treated in Chapter 6 as preparation for the analysis in Chapter 11 of
detectors of Gaussian stochastic signals and the calculation of their error probabilities.
Here the Kalman-Bucy equations play an important role. The methods
developed in this context figure in the treatment of the photocounting detection of 
optical signals in the last chapter. Photomultiplication and heterodyne detection of 
optical signals are also studied. Ten appendixes collect useful formulas and present 
calculations too tedious for inclusion in the main text. 

The reader of this book should be familiar with elementary concepts of probability,
such as conditional probability, distributions of random variables, expected
values, and correlation. Numerous references to the widely available textbooks by 
Papoulis [Pap91] and the writer [Hel91] direct the reader to fuller treatments of 
certain details of probability theory when they arise in the course of our study. 
(Notations in brackets refer to the bibliography at the end of this book.) Some 
acquaintance with complex variables and Fourier and Laplace transforms is also 
presumed. The lecture notes on which this book is based were used by graduate 
students in a one-year course on detection theory. An array of problems will be found 
at the end of each chapter. 

For the sake of readers who wish to delve more deeply into the topics of this 
book, we have provided references to papers from which a search of the literature can 
begin. Our references are not necessarily to either the original work or the most recent 
work on a topic, and the omission of references in a particular context does not imply 
that the results presented are our own creation. For other approaches to signal 
detection and for the treatment of certain specialized aspects of the subject, the 
reader might consult the books by Van Trees [Van68], Whalen [Wha71], Poor
[Poo88], Kassam [Kas88], and Kazakos and Papantoni-Kazakos [Kaz90].

Detection theory was the subject of an earlier work, Statistical Theory of Signal 
Detection (New York: Pergamon Press, Inc., 1st ed., 1960; 2nd ed., 1968). I am 
grateful to Pergamon Press for permission to use this material. 

Carl W. Helstrom 
The Statistical Foundation



1.1 DECISION THEORY

Signal-detection theory views a receiver as a device for making decisions about the
composition of its input, which is typically the voltage across the terminals of an
antenna. A radar transmitter, for instance, sends out a short burst of electromagnetic
energy at regular intervals, and the radar receiver must decide whether its input
contains minute portions of that energy reflected from remote targets. A communication
system represents each message symbol by a signal of a particular form and
periodically transmits one of its "alphabet" of signals to a distant receiver. The receiver
must synchronously decide which of the signals in the alphabet has appeared
at its input, and the sequence of its choices constitutes the received message.

The voltage across the terminals of the antenna continually varies in an unpredictable
manner because of random fluctuations in its ambient electromagnetic
field. These are caused partially by chaotic thermal motions of ions and electrons in
the field of view of the antenna and partially by interfering signals from power lines,
communication systems, lightning, electrical machinery, and so on. The resulting
fluctuations at the input to the receiver are called noise. The signals about which the
receiver must decide appear in the midst of this noise, which causes the decisions,
however they are made, occasionally to be wrong. A radar receiver may mistakenly
decide that an echo signal is present when there is none, or it may overlook one
that is really there. A communications receiver may decide that the symbol F was
transmitted when what was actually sent was an H. Detection theory considers how
to design a receiver that suffers such errors as seldom as possible.






As a random phenomenon, the combination of signals and noise must be
described statistically and analyzed in the framework of the theory of probability.
For definiteness let us consider a communication system dispatching messages written
in an alphabet of $M$ symbols. To each symbol corresponds a signal of a certain form,
which is transmitted to the distant receiver and appears in its input attenuated,
possibly distorted, and corrupted by random noise. The proposition that the $j$th
signal was transmitted is equivalent to a hypothesis about the composition of the
input $v(t)$ to the receiver during a certain interval of time; we denote this hypothesis
by $H_j$, $1 \le j \le M$. The receiver must choose one of these $M$ hypotheses on the
basis of its input $v(t)$ during the observation interval, say $0 < t < T$.

The input $v(t)$ is a stochastic process, described in terms of probability density
functions. For simplicity we suppose that the input during $(0, T)$ has been appropriately
sampled and can be represented by $n$ samples $(v_1, v_2, \ldots, v_n)$. We designate
these data collectively by a vector $\mathbf{v} = (v_1, v_2, \ldots, v_n)$, and we represent $\mathbf{v}$ as a point
in an $n$-dimensional Cartesian space $R_n$. Just how the sampling can conveniently be
accomplished and how it can be expanded to include all the relevant information in
$v(t)$ will be treated later.

Under hypothesis $H_j$, that is, when the $j$th of the $M$ signals has been transmitted,
the $n$ samples $\mathbf{v}$ are random variables having a joint probability density function

$$p_j(\mathbf{v}) = p_j(v_1, v_2, \ldots, v_n).$$

Its form depends on the properties of the received signal and the noise. We remind
the reader that a probability density function such as $p_j(\mathbf{v})$ is a nonnegative function
whose integral over the entire space $R_n$ equals 1:

$$\int_{R_n} p_j(\mathbf{v})\, d^n v = 1, \qquad 1 \le j \le M.$$

Here $d^n v = dv_1\, dv_2 \cdots dv_n$ is the volume element in the data space. The probability
under hypothesis $H_j$ that the point $\mathbf{v}$ representing a particular set of samples lies in
an arbitrary region $\Delta$ of that space is

$$\Pr(\mathbf{v} \in \Delta \mid H_j) = \int_{\Delta} p_j(\mathbf{v})\, d^n v.$$

On the basis of the observed values of $(v_1, v_2, \ldots, v_n)$ the receiver is to decide
among the $M$ hypotheses $H_1, H_2, \ldots, H_M$. That is, it must choose which of the $M$
probability density functions $p_j(\mathbf{v})$ it believes actually to characterize the input $v(t)$
during $(0, T)$. The scheme by which the receiver makes these choices is called a
strategy. It must assign a definite selection among $H_1, H_2, \ldots, H_M$ to each possible
set $\mathbf{v}$ of samples. The strategy can be visualized as a division of the space $R_n$ of the
data $(v_1, v_2, \ldots, v_n)$ into $M$ disjoint regions $R_1, R_2, \ldots, R_M$. When the point $\mathbf{v}$ falls
into region $R_j$, the receiver chooses hypothesis $H_j$, deciding that the $j$th signal was
transmitted. How can this decomposition of $R_n$ best be made?



1.1.1 Bayes's Rule 



In a communication system it is desirable that the receiver make its decisions with as
few errors as possible in the long run. Stated otherwise, the probability $Q$ that the
receiver makes correct decisions should be as large as possible. This probability $Q$
depends on the structure of the regions $R_1, R_2, \ldots, R_M$, on the relative frequencies
of the several signals, and on the set of probability density functions $\{p_j(\mathbf{v})\}$.

When signal $j$ is transmitted, the probability that the receiver makes the correct
decision equals the conditional probability under hypothesis $H_j$ that the point $\mathbf{v}$ falls
into region $R_j$:

$$\Pr(\to H_j \mid H_j) = \Pr(\mathbf{v} \in R_j \mid H_j) = \int_{R_j} p_j(\mathbf{v})\, d^n v. \tag{1-1}$$

Here $\to H_j$ denotes the event "hypothesis $H_j$ is chosen."

Let $\zeta_j = \Pr(H_j)$ be the relative frequency with which the symbol $j$ appears in
the messages and signal $j$ is transmitted:

$$\sum_{j=1}^{M} \zeta_j = 1.$$

We call $\zeta_j$ the prior probability of hypothesis $H_j$. It must be known to the designer
of the receiver. The overall probability of correct decision is then the weighted sum

$$Q = \sum_{j=1}^{M} \zeta_j \Pr(\to H_j \mid H_j) = \sum_{j=1}^{M} \zeta_j \int_{R_j} p_j(\mathbf{v})\, d^n v. \tag{1-2}$$

The probability of error is $P_e = 1 - Q$.
Denote by

$$p(\mathbf{v}) = \sum_{j=1}^{M} \zeta_j\, p_j(\mathbf{v}) \tag{1-3}$$

the overall probability density function of the data $\mathbf{v} = (v_1, \ldots, v_n)$. The conditional
or posterior probability that hypothesis $H_j$ is true when a particular set $\mathbf{v}$ of samples
has been recorded is specified by Bayes's theorem as

$$\Pr(H_j \mid \mathbf{v}) = \frac{\zeta_j\, p_j(\mathbf{v})}{p(\mathbf{v})}, \qquad 1 \le j \le M, \tag{1-4}$$

[Hel91, p. 90],¹ [Pap91, p. 83]. This follows from the basic definition of a conditional
probability: for two events $A$ and $B$,

$$\Pr(B \mid A) = \frac{\Pr(A \cap B)}{\Pr(A)} = \frac{\Pr(A \mid B)\Pr(B)}{\Pr(A)}$$

[Hel91, p. 25], [Pap91, p. 27]. For event $B$ put "Hypothesis $H_j$ is true," and for event
$A$ put "The data point lies in an infinitesimal region $\Delta$ about $\mathbf{v}$": $\Pr(A) = p(\mathbf{v})\Delta$,
$\Pr(A \mid B) = p_j(\mathbf{v})\Delta$.

¹ A notation such as this refers to the Bibliography at the end of the book.
For any set $\mathbf{v}$ of data, one of the $M$ posterior probabilities in (1-4), say
$\Pr(H_k \mid \mathbf{v})$, will in general be largest. If the receiver then decides for hypothesis
$H_k$, always obeying the rule, "Choose that hypothesis with the greatest posterior
probability, given $\mathbf{v}$," it will attain the maximum probability $Q$ of correct decision.
We can symbolize this rule as

$$\{\Pr(H_k \mid \mathbf{v}) > \Pr(H_j \mid \mathbf{v}),\ \forall j \ne k\} \Rightarrow\ \to H_k. \tag{1-5}$$

It is known as Bayes's rule, for it was enounced by the Reverend Thomas Bayes in a
paper, "An Essay toward Solving a Problem in the Doctrine of Chances," published
posthumously in 1763 [Bay63].

It is not difficult to see that using Bayes's rule indeed maximizes the probability
$Q$ of correct decision. Let us use (1-4) to write (1-2) as

$$Q = \sum_{j=1}^{M} \int_{R_j} p(\mathbf{v}) \Pr(H_j \mid \mathbf{v})\, d^n v. \tag{1-6}$$

In order that $Q$ be as large as possible, those points $\mathbf{v}$ that are to be included in each
particular region $R_k$ are those for which $\Pr(H_k \mid \mathbf{v})$ exceeds all the other posterior
probabilities $\Pr(H_j \mid \mathbf{v})$; any other assignment of points to region $R_k$ would diminish
$Q$. Points for which two or more of the posterior probabilities are equal lie on
the boundaries of adjacent regions $R_j$ and, for the purpose of making a definite
decision, can be assigned to either of them without altering the maximum value of
the probability $Q$ of correct decision. A receiver that thus maximizes the probability
$Q$ of correct decision, or minimizes the probability $P_e$ of error, is said to be optimum.
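
As an illustration of the rule "choose that hypothesis with the greatest posterior probability," the following Python sketch evaluates the posterior probabilities of (1-4) for a single scalar datum and picks the largest. The Gaussian densities, the means, and the prior probabilities are hypothetical values chosen only for illustration; none of them comes from the text.

```python
import math

def gaussian_pdf(v, mean, sigma):
    """Gaussian density with the given mean and standard deviation (illustrative model)."""
    return math.exp(-(v - mean) ** 2 / (2.0 * sigma ** 2)) / math.sqrt(2.0 * math.pi * sigma ** 2)

def posteriors(v, priors, densities):
    """Posterior probabilities Pr(H_j | v): prior times density, divided by the overall density."""
    weighted = [zeta * p(v) for zeta, p in zip(priors, densities)]
    total = sum(weighted)  # overall density p(v) of the datum
    return [w / total for w in weighted]

def bayes_rule(v, priors, densities):
    """Return the 0-based index of the hypothesis with the greatest posterior probability."""
    post = posteriors(v, priors, densities)
    return max(range(len(post)), key=post.__getitem__)

# Hypothetical setup: three hypotheses with means 0, 1, 3 and unit variance.
MEANS = [0.0, 1.0, 3.0]
PRIORS = [0.5, 0.3, 0.2]
DENSITIES = [lambda v, a=a: gaussian_pdf(v, a, 1.0) for a in MEANS]

if __name__ == "__main__":
    for v in (-1.0, 1.2, 4.0):
        print(v, bayes_rule(v, PRIORS, DENSITIES), posteriors(v, PRIORS, DENSITIES))
```

Ties between posterior probabilities are broken here in favor of the lowest index, which is one of the arbitrary assignments of boundary points that the argument above permits.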

In any given decision problem the prior probabilities $\zeta_j$ of the hypotheses are
fixed. What is new in each observation or trial is the set $\mathbf{v}$ of data, and it determines
the decision through the values $p_j(\mathbf{v})$, $1 \le j \le M$, of the $M$ probability density functions
at point $\mathbf{v}$, which determine the posterior probabilities $\Pr(H_j \mid \mathbf{v})$ through (1-4).
The primary task of the receiver is to generate those $M$ numbers $p_j(\mathbf{v})$. Equivalently,
it suffices for the receiver to produce $M$ likelihood ratios

$$\Lambda_j(\mathbf{v}) = \frac{p_j(\mathbf{v})}{p_d(\mathbf{v})}, \qquad 1 \le j \le M, \tag{1-7}$$

where $p_d(\mathbf{v})$ is any probability density function of the samples $(v_1, v_2, \ldots, v_n)$ that is
nonzero at all points in $R_n$ where any probability density function $p_j(\mathbf{v})$ is nonzero.
We can think of $p_d(\mathbf{v})$ as the joint probability density function of the data $\mathbf{v}$ under a
dummy hypothesis $H_d$. It is often convenient to take $H_d$ as representing the absence
of any signals whatever, the input to the receiver consisting of noise alone. The
dummy hypothesis $H_d$ may or may not figure among the actually possible hypotheses
$H_1$ through $H_M$.

Knowing the values of the $M$ likelihood ratios $\Lambda_j(\mathbf{v})$ permits determining which
of the posterior probabilities $\Pr(H_j \mid \mathbf{v})$ is maximum, for

$$\Pr(H_j \mid \mathbf{v}) = \frac{\zeta_j \Lambda_j(\mathbf{v})}{\sum_{m=1}^{M} \zeta_m \Lambda_m(\mathbf{v})}.$$



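According to (1-5), the optimum receiver calculates the $M$ products $\zeta_j \Lambda_j(\mathbf{v})$,
$1 \le j \le M$, and selects the hypothesis corresponding to the largest among them. The
region $R_k$ of the data space is the set of points $\mathbf{v}$ such that

$$\{\zeta_k \Lambda_k(\mathbf{v}) > \zeta_j \Lambda_j(\mathbf{v}),\ \forall j \ne k\} \Rightarrow \mathbf{v} \in R_k. \tag{1-8}$$

The probability of correct decision is now

$$\begin{aligned}
Q = 1 - P_e &= \sum_{j=1}^{M} \zeta_j \Pr(\mathbf{v} \in R_j \mid H_j) = \sum_{j=1}^{M} \zeta_j \int_{R_j} p_j(\mathbf{v})\, d^n v \\
&= \sum_{j=1}^{M} \int_{R_j} \zeta_j \Lambda_j(\mathbf{v})\, p_d(\mathbf{v})\, d^n v = \int_{R_n} \max_k \left[\zeta_k \Lambda_k(\mathbf{v})\right] p_d(\mathbf{v})\, d^n v \\
&= E\left[\max_k \zeta_k \Lambda_k(\mathbf{v})\right],
\end{aligned} \tag{1-9}$$

where $E$ indicates an expected value, here with respect to the distribution of the data
$\mathbf{v}$ under the dummy hypothesis $H_d$ that they are distributed with the probability
density function $p_d(\mathbf{v})$. As we shall see in Chapter 2, the use of such likelihood
ratios $\Lambda_j(\mathbf{v})$ facilitates expanding the set $\mathbf{v}$ of samples to include all the information
in the input $v(t)$ relevant to choosing among the $M$ hypotheses $H_1, H_2, \ldots, H_M$ in
the optimum fashion.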
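
A numerical check can make the expected-value form of the probability of correct decision concrete. The sketch below assumes a hypothetical scalar Gaussian model (the means, the variance, and the equal prior probabilities are illustrative choices, not values from the text) and takes the noise-only density as the dummy hypothesis. It compares a midpoint-rule approximation of the expected value of the largest product of prior probability and likelihood ratio under the dummy hypothesis with the direct sum of the prior-weighted probabilities of the decision regions.

```python
import math
from statistics import NormalDist

MEANS = [-1.0, 0.0, 1.0]          # hypothetical signal levels, ascending
SIGMA = 1.0                       # hypothetical noise standard deviation
PRIORS = [1.0 / 3.0] * 3          # equally likely hypotheses (assumption)

def likelihood_ratio(v, a, sigma):
    """p_j(v) / p_d(v) for a Gaussian of mean a against the zero-mean (noise-only) Gaussian."""
    return math.exp((2.0 * a * v - a * a) / (2.0 * sigma ** 2))

def q_from_expectation(means, priors, sigma, half_width=12.0, step=1e-3):
    """Midpoint-rule integral of max_k[prior_k * ratio_k(v)] * p_d(v) over the real line."""
    noise = NormalDist(0.0, sigma)
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for i in range(n):
        v = -half_width + (i + 0.5) * step
        best = max(z * likelihood_ratio(v, a, sigma) for z, a in zip(priors, means))
        total += best * noise.pdf(v) * step
    return total

def q_from_regions(means, priors, sigma):
    """Sum of prior_j * Pr(v in R_j | H_j), with boundaries midway between adjacent means.
    The midpoint boundaries are optimum here only because the priors are equal."""
    edges = [-math.inf] + [(a + b) / 2.0 for a, b in zip(means, means[1:])] + [math.inf]
    q = 0.0
    for j, (a, z) in enumerate(zip(means, priors)):
        dist = NormalDist(a, sigma)
        q += z * (dist.cdf(edges[j + 1]) - dist.cdf(edges[j]))
    return q

if __name__ == "__main__":
    print(q_from_expectation(MEANS, PRIORS, SIGMA), q_from_regions(MEANS, PRIORS, SIGMA))
```

The two numbers should agree to within the accuracy of the numerical integration; the integration limits and step size are arbitrary choices that may need adjusting for other parameter values.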

Example 1-1 Gaussian datum with M unequal expected values

A single datum $v$ has a Gaussian (or normal) distribution under each of $M$ hypotheses
$H_j$; its probability density functions are

$$p_j(v) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(v - a_j)^2}{2\sigma^2}\right], \qquad j = 1, 2, \ldots, M; \tag{1-10}$$

$\sigma^2$ is its variance, and its expected value under hypothesis $H_j$ is

$$E(v \mid H_j) = a_j$$

[Hel91, p. 81], [Pap91, p. 74]. Assume for simplicity that these expected values are
arranged in ascending order, $a_1 < a_2 < \cdots < a_M$. On the basis of a single observation
of the random variable $v$, we are to decide among these $M$ hypotheses with minimum
probability of error. We shall see later that the design of a receiver in a coherent
pulse-amplitude-modulated communication system reduces to a decision problem of
this form. The expected values $a_j$ correspond to the signal levels and the variance $\sigma^2$
to the strength of the noise.

Take the dummy hypothesis $H_d$ as representing the absence of any signal at all:

$$p_d(v) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{v^2}{2\sigma^2}\right).$$

Then the likelihood ratios in (1-7) are

$$\Lambda_j(v) = \exp\left[\frac{v^2}{2\sigma^2} - \frac{(v - a_j)^2}{2\sigma^2}\right] = \exp\left[\frac{2 a_j v - a_j^2}{2\sigma^2}\right].$$
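Assume that the $M$ hypotheses are equally likely: $\zeta_j = 1/M$. The optimum receiver
selects that hypothesis for which $\zeta_j \Lambda_j(v) = M^{-1} \Lambda_j(v)$ is maximum, that is, the hypothesis
for which $a_j v - \tfrac{1}{2} a_j^2$ is largest.

The data space is now the real line, and this decision rule corresponds to a division
of the line into $M$ segments $R_j$ separated by points midway between the expected values
$a_j$:

$$R_1: -\infty < v < \tfrac{1}{2}(a_1 + a_2), \quad R_2: \tfrac{1}{2}(a_1 + a_2) < v < \tfrac{1}{2}(a_2 + a_3), \quad \ldots,$$
$$R_j: \tfrac{1}{2}(a_{j-1} + a_j) < v < \tfrac{1}{2}(a_j + a_{j+1}), \quad \ldots, \quad R_M: \tfrac{1}{2}(a_{M-1} + a_M) < v < \infty.$$

The probability when hypothesis $H_j$ is true that hypothesis $H_j$ is correctly chosen is, by
(1-10),

$$\begin{aligned}
\Pr(\to H_j \mid H_j) = \Pr(v \in R_j \mid H_j) &= \int_{(a_{j-1}+a_j)/2}^{(a_j+a_{j+1})/2} p_j(v)\, dv = \frac{1}{\sqrt{2\pi}} \int_{(a_{j-1}-a_j)/2\sigma}^{(a_{j+1}-a_j)/2\sigma} e^{-x^2/2}\, dx \\
&= \operatorname{erfc}\left(\frac{a_{j-1}-a_j}{2\sigma}\right) - \operatorname{erfc}\left(\frac{a_{j+1}-a_j}{2\sigma}\right).
\end{aligned}$$

This holds also for $j = 1$ and $j = M$ if we take $a_0 = -\infty$, $a_{M+1} = \infty$.
Here $\operatorname{erfc} x$ is the error-function integral:

$$\operatorname{erfc} x = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt. \tag{1-11}$$

Advice about computing $\operatorname{erfc} x$ is given in Appendix A of [Hel91]; tables are to be
found there, in [Abr70, pp. 966-72], and in handbooks of probability and statistics. In
particular,

$$\operatorname{erfc}(-\infty) = 1, \qquad \operatorname{erfc} 0 = \tfrac{1}{2}, \qquad \operatorname{erfc} \infty = 0.$$

Because $\operatorname{erfc}(-x) = 1 - \operatorname{erfc} x$, we can write the probability of correct decision
under hypothesis $H_j$ as

$$\Pr(\to H_j \mid H_j) = 1 - \operatorname{erfc}\left(\frac{a_j - a_{j-1}}{2\sigma}\right) - \operatorname{erfc}\left(\frac{a_{j+1} - a_j}{2\sigma}\right).$$

The overall probability of correct decision is therefore, as in (1-2),

$$Q = \frac{1}{M} \sum_{j=1}^{M} \Pr(\to H_j \mid H_j) = 1 - \frac{2}{M} \sum_{j=1}^{M-1} \operatorname{erfc}\left(\frac{a_{j+1} - a_j}{2\sigma}\right),$$

and the probability of error is $P_e = 1 - Q$. When the expected values are uniformly
spaced by $\delta$, this reduces to an error probability of

$$P_e = \frac{2(M-1)}{M} \operatorname{erfc}\left(\frac{\delta}{2\sigma}\right). \tag{1-12}$$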
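
The error probability of this example is easy to evaluate numerically. The sketch below is one possible way to do so in Python, under two stated assumptions: the error-function integral of (1-11) is the tail of the unit normal distribution, which differs from the library function `math.erfc` and is obtained from it as `0.5 * math.erfc(x / sqrt(2))`; and the signal levels, noise strength, trial count, and seed are hypothetical values chosen for illustration. A seeded Monte Carlo simulation of the decision rule provides an independent check.

```python
import math
import random

def erfc_h(x):
    """Tail integral of (1-11): (2*pi)**-0.5 * integral from x to infinity of exp(-t*t/2) dt."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def prob_correct(levels, sigma):
    """Average over equally likely hypotheses of
    1 - erfc_h((a_j - a_{j-1}) / (2 sigma)) - erfc_h((a_{j+1} - a_j) / (2 sigma)),
    taking a_0 = -infinity and a_{M+1} = +infinity."""
    padded = [-math.inf] + list(levels) + [math.inf]
    total = 0.0
    for j in range(1, len(padded) - 1):
        lower_gap = padded[j] - padded[j - 1]
        upper_gap = padded[j + 1] - padded[j]
        total += 1.0 - erfc_h(lower_gap / (2.0 * sigma)) - erfc_h(upper_gap / (2.0 * sigma))
    return total / len(levels)

def decide(v, levels):
    """Choose the 0-based index j for which a_j * v - a_j**2 / 2 is largest."""
    return max(range(len(levels)), key=lambda j: levels[j] * v - 0.5 * levels[j] ** 2)

def simulated_error_rate(levels, sigma, trials=20000, seed=1):
    """Monte Carlo estimate of the error probability (seeded for repeatability)."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        j = rng.randrange(len(levels))       # equally likely hypotheses
        v = rng.gauss(levels[j], sigma)      # Gaussian datum with mean a_j
        if decide(v, levels) != j:
            errors += 1
    return errors / trials

if __name__ == "__main__":
    levels = [0.0, 1.0, 2.0, 3.0]            # hypothetical, uniformly spaced
    print(1.0 - prob_correct(levels, 0.5), simulated_error_rate(levels, 0.5))
```

For uniformly spaced levels the analytic value should coincide with the closed form for uniform spacing, and the simulated rate should agree with it to within ordinary sampling fluctuations.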

1.1.2 Minimizing Average Cost 



It was observed by Abraham Wald [Wal39] that in some situations in which choices
must be made among statistical hypotheses, certain errors are more serious than
others. It would then be more sensible to adopt a strategy minimizing not the overall
probability of error, but the average cost of operation. The designer is presumed
to know the cost $C_{ij}$ incurred upon choosing hypothesis $H_i$ when hypothesis $H_j$ is
true, $1 \le (i, j) \le M$. These costs $C_{ij}$ will depend both on the action attendant on
each decision and on the true "state of nature" when that action is taken. They can
be assembled into a cost matrix $C = \|C_{ij}\|$.

Suppose that the set $\mathbf{v}$ of $n$ samples has been observed. The conditional risk
associated with choosing hypothesis $H_i$ is obtained by weighting the costs $C_{ij}$ attending
that choice by the posterior probabilities, given $\mathbf{v}$, that the several hypotheses
$H_j$ are true:

$$C(\to H_i \mid \mathbf{v}) = \sum_{j=1}^{M} C_{ij} \Pr(H_j \mid \mathbf{v}); \tag{1-13}$$

the conditional probabilities $\Pr(H_j \mid \mathbf{v})$ are given in (1-4). Let us again represent the
strategy by a certain division of the data space $R_n$ into $M$ decision regions $R_i$. The
cost of applying it, averaged over a long series of trials, is

$$\bar{C} = \sum_{i=1}^{M} \int_{R_i} p(\mathbf{v})\, C(\to H_i \mid \mathbf{v})\, d^n v = \sum_{i=1}^{M} \sum_{j=1}^{M} \zeta_j C_{ij} \int_{R_i} p_j(\mathbf{v})\, d^n v \tag{1-14}$$

by (1-4). By the same argument as we used to establish Bayes's rule, we can convince
ourselves that the strategy that minimizes the average cost $\bar{C}$ of operation is that for
which each region $R_k$ contains those points $\mathbf{v}$ where the conditional risk $C(\to H_k \mid \mathbf{v})$
is smaller than all the other conditional risks $C(\to H_i \mid \mathbf{v})$:

$$\{C(\to H_k \mid \mathbf{v}) < C(\to H_i \mid \mathbf{v}),\ \forall i \ne k\} \Rightarrow\ \to H_k. \tag{1-15}$$

The observer chooses that hypothesis whose conditional risk $C(\to H_k \mid \mathbf{v})$, given the
data $\mathbf{v}$, is least. The strategy based on this rule is usually called the Bayes strategy.

When we introduce a dummy hypothesis $H_d$ under which the data $\mathbf{v}$ have a
joint probability density function $p_d(\mathbf{v})$, these conditional risks can be expressed in
terms of the likelihood ratios $\Lambda_j(\mathbf{v})$ defined in (1-7). From (1-13) and (1-4),

$$C(\to H_i \mid \mathbf{v}) = \frac{p_d(\mathbf{v})}{p(\mathbf{v})} \sum_{j=1}^{M} \zeta_j C_{ij} \Lambda_j(\mathbf{v}) = \frac{1}{L(\mathbf{v})} \sum_{j=1}^{M} \zeta_j C_{ij} \Lambda_j(\mathbf{v}),$$

where, by (1-3),

$$L(\mathbf{v}) = \frac{p(\mathbf{v})}{p_d(\mathbf{v})} = \sum_{j=1}^{M} \zeta_j \Lambda_j(\mathbf{v}).$$

Again the receiver can base its decisions on the set of likelihood ratios $\Lambda_j(\mathbf{v})$,
$j = 1, 2, \ldots, M$.

The Bayes strategy is equivalent to Bayes's rule (1-5) when the relative costs of
all errors are the same:

$$C_{ij} = \begin{cases} A, & i = j, \\ A + c, & i \ne j, \end{cases} \tag{1-16}$$

with $c > 0$ and $A$ arbitrary, for then, by (1-13),

$$C(\to H_i \mid \mathbf{v}) = A + c - c \Pr(H_i \mid \mathbf{v})$$

is smallest when $\Pr(H_i \mid \mathbf{v})$ is largest.
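
The difference between minimizing the probability of error and minimizing the average cost shows up already with two hypotheses. The following Python sketch computes the conditional risks of (1-13) from a given set of posterior probabilities and selects the hypothesis of least risk. The posterior probabilities and cost values are hypothetical numbers chosen for illustration; `cost[i][j]` is assumed to hold the cost of choosing hypothesis i when hypothesis j is true, with 0-based indices.

```python
def conditional_risks(posterior, cost):
    """Conditional risk of each choice i: sum over j of cost[i][j] * posterior[j]."""
    return [sum(c * p for c, p in zip(row, posterior)) for row in cost]

def bayes_strategy(posterior, cost):
    """Return the 0-based index of the hypothesis with the least conditional risk."""
    risks = conditional_risks(posterior, cost)
    return min(range(len(risks)), key=risks.__getitem__)

def uniform_error_costs(m, a, c):
    """Cost matrix with a on the diagonal and a + c elsewhere (all errors equally costly)."""
    return [[a if i == j else a + c for j in range(m)] for i in range(m)]

if __name__ == "__main__":
    posterior = [0.6, 0.4]                      # hypothetical posterior probabilities
    lopsided = [[0.0, 10.0], [1.0, 0.0]]        # missing hypothesis 1 is ten times as costly
    print(bayes_strategy(posterior, lopsided))
    print(bayes_strategy(posterior, uniform_error_costs(2, 2.0, 3.0)))
```

With the lopsided costs the strategy selects the less probable hypothesis, because the risk of wrongly rejecting it outweighs its smaller posterior probability; with equal error costs the choice reverts to the hypothesis of greatest posterior probability.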
Once a radar system has detected a target, the observer would like to know how
far away it is and how fast it is approaching. The distance of the target determines
the arrival time t of the echo signal, and its speed determines the Doppler shift w
of the carrier frequency of the echo. Quantities such as these are parameters of the
joint probability density function of the samples $\mathbf{v}$ of the input $v(t)$ to the radar
receiver, and the observer estimates them on the basis of those data $\mathbf{v}$. How this is
best accomplished will be treated in Chapter 6, where we shall see that parameter
estimation can be considered as a choice among an infinite number of hypotheses
and thus as a limiting case $M \to \infty$ of the $M$-ary decision theory we have introduced
here.

Decision theory has been applied by statisticians in numerous contexts other 
than the design of receivers. The general theory is treated in such books as those 
by Blackwell and Girshick [Bla54], Luce and Raiffa [Luc57], Chernoff and Moses 
[Che59], Lehmann [Leh59], and Winkler [Win72]. The concepts of decision theory 
were applied to radar detection at the M.I.T. Radiation Laboratory during World
War II [Law50]. Communication receivers based on conditional probability were recommended
by Kotel'nikov in a dissertation written in Russia in 1947, published there
in 1956, and translated into English in 1959 [Kot59]. In the meantime Woodward
and Davies suggested applying conditional probability to signal detection [Woo50],
[Woo53]. The analysis of detection problems in terms of statistical decision theory
was developed by Middleton [Mid53] while the design of receivers on the basis of
the likelihood ratio was being advanced by a group at the University of Michigan
[Pet54]. Many problems in the detection of signals and the measurement of signal
parameters have since then been studied by means of the theory of statistical decisions.
William Root, who contributed much to detection theory, has written a broad
survey of its development [Roo87]. 



1.2 BINARY DECISIONS 

1.2.1 Bayes Strategy 

In a radar system, viewed in the most elementary way, the task of the receiver is
to decide whether the echo from some target is or is not present in its input $v(t)$
observed during a certain interval $(0, T)$. The receiver in effect chooses between two
hypotheses: ($H_0$) "no signal is present and $v(t)$ consists only of random noise," and
($H_1$) "a signal or one of a specified class of signals is present in addition to the
noise." The receiver is said to make a binary decision. Hypothesis $H_0$ is commonly
called the null hypothesis, $H_1$ the alternative hypothesis.

In most communication systems messages are coded into binary symbols such
as 0 and 1. For each 0 a certain signal, say $s_0(t)$, is sent. Sometimes this signal
is identically zero. For each 1 some other signal $s_1(t)$ is transmitted. Here too the
receiver makes a binary decision, choosing between hypothesis $H_0$, "Signal $s_0(t)$ has
arrived in the midst of random noise," and hypothesis $H_1$, "Signal $s_1(t)$ has arrived
in the midst of noise." Because of the overwhelming importance of binary decisions,
the greater portion of our study will be devoted to them.
In binary decisions the data space $R_n$ is divided into only two regions: $R_0$,
containing sample points $\mathbf{v}$ inducing the choice of hypothesis $H_0$, and $R_1$, containing
points $\mathbf{v}$ inducing the choice of $H_1$. They are separated by a hypersurface called the
decision surface and denoted by $D$. The problem is to find the optimum location of
this surface in the data space.

Denoting by $C_{ij}$ the cost attending the choice of hypothesis $H_i$ when $H_j$ is true
($i, j = 0, 1$), we can as in (1-15) express the Bayes strategy in a binary decision as

$$\{C(\to H_1 \mid \mathbf{v}) < C(\to H_0 \mid \mathbf{v})\} \Rightarrow\ \to H_1$$

or, by (1-13),

$$\{C_{10} \Pr(H_0 \mid \mathbf{v}) + C_{11} \Pr(H_1 \mid \mathbf{v}) < C_{00} \Pr(H_0 \mid \mathbf{v}) + C_{01} \Pr(H_1 \mid \mathbf{v})\} \Rightarrow\ \to H_1.$$

Substituting from (1-4) and manipulating the resulting inequality, we obtain the rule

$$\left\{\Lambda(\mathbf{v}) = \frac{p_1(\mathbf{v})}{p_0(\mathbf{v})} > \frac{\zeta_0 (C_{10} - C_{00})}{\zeta_1 (C_{01} - C_{11})} = \Lambda_0\right\} \Rightarrow\ \to H_1. \tag{1-17}$$

Here $\zeta_0 = \Pr(H_0)$ and $\zeta_1 = \Pr(H_1)$ are the prior probabilities of hypotheses $H_0$ and
$H_1$, respectively; $\zeta_0 + \zeta_1 = 1$. The quantity $\Lambda(\mathbf{v}) = p_1(\mathbf{v})/p_0(\mathbf{v})$ is the likelihood
ratio appropriate for binary decision. The receiver computes $\Lambda(\mathbf{v})$ from the values
$(v_1, v_2, \ldots, v_n)$ of the $n$ samples of its input during $(0, T)$ and compares it with
the decision level $\Lambda_0$, which incorporates the costs and the prior probabilities. If
$\Lambda(\mathbf{v}) < \Lambda_0$, hypothesis $H_0$ is selected; otherwise $H_1$.

Two kinds of error arise in binary decisions: choosing hypothesis $H_1$ when $H_0$
is true, an "error of the first kind," and choosing $H_0$ when $H_1$ is true, an "error
of the second kind." In radar and sonar applications these are called a false alarm
and a false dismissal, respectively. The relative costs of the two kinds of error are
$C_{10} - C_{00}$ and $C_{01} - C_{11}$. When these are equal, $\Lambda_0 = \zeta_0/\zeta_1$; (1-17) is then equivalent
to Bayes's rule, and the probability $P_e$ of error is minimized.

The probability of an error of the first kind, or the false-alarm probability $Q_0$, is

$$Q_0 = \Pr(\to H_1 \mid H_0) = \Pr(\mathbf{v} \in R_1 \mid H_0) = \int_{R_1} p_0(\mathbf{v})\, d^n v, \tag{1-18}$$

where $p_0(\mathbf{v})$ is the joint probability density function of the samples $(v_1, v_2, \ldots, v_n)$
under hypothesis $H_0$.

The event complementary to an error of the second kind, that is, deciding that
a signal is present when it is indeed at hand, is called a detection. The probability
$Q_d$ of detection is

$$Q_d = \Pr(\to H_1 \mid H_1) = \Pr(\mathbf{v} \in R_1 \mid H_1) = \int_{R_1} p_1(\mathbf{v})\, d^n v, \tag{1-19}$$

where $p_1(\mathbf{v})$ is the joint probability density function of the samples $(v_1, v_2, \ldots, v_n)$
under hypothesis $H_1$. The false-dismissal probability is $Q_1 = 1 - Q_d$. Statisticians
call $Q_0$ the size and $Q_d$ the power of the binary hypothesis test. In terms of these,
the probability $P_e$ of error is

$$P_e = \zeta_0 Q_0 + \zeta_1 (1 - Q_d). \tag{1-20}$$
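
To see how the decision level, the false-alarm and detection probabilities, and the probability of error fit together, consider a hypothetical single-datum example that is not taken from the text: under the null hypothesis the datum is Gaussian with mean 0, under the alternative it is Gaussian with mean a > 0, and the variance is the same in both cases. The likelihood ratio is then exp[(2av - a²)/(2σ²)], so comparing it with a decision level is equivalent to comparing v with a threshold. The Python sketch below works this out; the function names, cost layout (`cost[i][j]` for choosing hypothesis i when j is true), and numerical values are assumptions made for illustration.

```python
import math

def normal_tail(x):
    """Probability that a unit normal variable exceeds x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def decision_level(zeta0, zeta1, cost):
    """Decision level: zeta0 * (C10 - C00) / (zeta1 * (C01 - C11))."""
    return zeta0 * (cost[1][0] - cost[0][0]) / (zeta1 * (cost[0][1] - cost[1][1]))

def binary_test(a, sigma, level):
    """Threshold on v equivalent to 'likelihood ratio > level', with the resulting
    false-alarm and detection probabilities (Gaussian mean-shift model, a > 0)."""
    v_t = a / 2.0 + (sigma ** 2 / a) * math.log(level)
    q0 = normal_tail(v_t / sigma)          # Pr(v > v_t | H0)
    qd = normal_tail((v_t - a) / sigma)    # Pr(v > v_t | H1)
    return v_t, q0, qd

def error_probability(zeta0, zeta1, q0, qd):
    """Probability of error: zeta0 * Q0 + zeta1 * (1 - Qd)."""
    return zeta0 * q0 + zeta1 * (1.0 - qd)

if __name__ == "__main__":
    level = decision_level(0.5, 0.5, [[0.0, 1.0], [1.0, 0.0]])
    v_t, q0, qd = binary_test(2.0, 1.0, level)
    print(level, v_t, q0, qd, error_probability(0.5, 0.5, q0, qd))
```

Trying other decision levels with the same prior probabilities should never give a smaller probability of error than the level obtained from equal error costs, which is a quick way to check the claim that this choice minimizes the probability of error.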
1.2.2 Neyman-Pearson Criterion

In many situations both the prior probabilities and the costs are difficult to estimate
or perhaps even to define. This is especially true in signal-detection problems in
radar, where it is hard to judge the cost of failing to detect a target and where
the prior probability of a signal may not even be a meaningful concept. When
hypothesis $H_1$ is true extremely rarely, the principal factor in the average total cost
is the fraction $Q_0$ of trials in which hypothesis $H_1$ is incorrectly chosen and an error
of the first kind is made, whereupon some costly action is taken in vain. In a radar
detection system, for instance, such a false alarm may lead to firing an expensive
missile to attack a nonexistent target. Under such circumstances it is appropriate
for the observer to determine an affordable value of the false-alarm probability $Q_0$
and to seek a decision strategy that attains this value and at the same time yields the
minimum possible probability $Q_1 = 1 - Q_d$ of making an error of the second kind.
This strategy, proposed by Neyman and Pearson [Ney33a], [Ney33b], is said to fulfill
the Neyman-Pearson criterion. It corresponds in radar to maximizing the probability
of detecting an echo signal while incurring a given false-alarm probability.

Adopting the Neyman-Pearson criterion, as we shall see, also calls for a strategy
based on the likelihood ratio and expressed by

$$\left\{\Lambda(\mathbf{v}) = \frac{p_1(\mathbf{v})}{p_0(\mathbf{v})} > \lambda\right\} \Rightarrow\ \to H_1. \tag{1-21}$$

The only difference from (1-17) is that the decision level $\lambda$ is determined by the
preassigned value of the false-alarm probability $Q_0$,

$$Q_0 = \int_{\lambda}^{\infty} P_0(\Lambda)\, d\Lambda, \tag{1-22}$$

where $P_0(\Lambda)$ is the probability density function of the random variable $\Lambda(\mathbf{v})$ under
hypothesis $H_0$. [Remember that because the likelihood ratio $\Lambda(\mathbf{v})$ is a function of
the $n$ random variables $v_1, v_2, \ldots, v_n$, it is itself a random variable.] The maximum
value of the probability $Q_d$ of detection so attained is

$$Q_d = \int_{\lambda}^{\infty} P_1(\Lambda)\, d\Lambda, \tag{1-23}$$

where $P_1(\Lambda)$ is the probability density function of $\Lambda(\mathbf{v})$ under $H_1$.
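
In practice the decision level is found by inverting the false-alarm relation. As a minimal sketch, assume again a hypothetical single Gaussian datum with mean 0 under the null hypothesis and mean a > 0 under the alternative, with common standard deviation σ (this model and all numerical values are illustrative, not from the text). Because the likelihood ratio exp[(2av - a²)/(2σ²)] increases with v, the test "likelihood ratio exceeds λ" is the same as "v exceeds a threshold", and the threshold follows from the preassigned false-alarm probability through the inverse normal distribution function.

```python
import math
from statistics import NormalDist

def threshold_for_false_alarm(q0, sigma):
    """Threshold v_t with Pr(v > v_t | H0) = q0 when v is N(0, sigma^2) under H0."""
    return NormalDist(0.0, sigma).inv_cdf(1.0 - q0)

def detection_probability(v_t, a, sigma):
    """Pr(v > v_t) when v is N(a, sigma^2); with a = 0 this is the false-alarm probability."""
    return 1.0 - NormalDist(a, sigma).cdf(v_t)

def likelihood_level(v_t, a, sigma):
    """Value of the likelihood ratio exp[(2 a v - a^2) / (2 sigma^2)] at v = v_t."""
    return math.exp((2.0 * a * v_t - a * a) / (2.0 * sigma ** 2))

if __name__ == "__main__":
    q0, a, sigma = 0.01, 3.0, 1.0            # hypothetical design values
    v_t = threshold_for_false_alarm(q0, sigma)
    print(v_t, likelihood_level(v_t, a, sigma), detection_probability(v_t, a, sigma))
```

The value returned by `likelihood_level` plays the role of the decision level for the likelihood ratio, while the detector itself needs only the threshold on v.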

In order to demonstrate that the rule in (1-21) is optimum, let us suppose that
regions $R_0$ and $R_1$ are separated by the decision surface $D$ given by $\Lambda(\mathbf{v}) = \lambda$, with
$\lambda$ chosen to satisfy (1-22). Then the probability of detection is

$$Q_d = \int_{R_1} p_1(\mathbf{v})\, d^n v = \int_{R_1} \Lambda(\mathbf{v})\, p_0(\mathbf{v})\, d^n v \tag{1-24}$$

by virtue of (1-21). Now, as shown for $n = 2$ in Fig. 1-1, we deform the decision
surface $D$ in such a way that the value of

$$Q_0 = \int_{R_1} p_0(\mathbf{v})\, d^n v$$

remains the same. There are countless ways of doing so. Call the new surface
$D'$. The points in region $a$ that have been transferred from $R_1$ to $R_0$ have the
same probability measure under hypothesis $H_0$ as those in region $b$ that have been
transferred from $R_0$ to $R_1$. If we measure the "value" of a point by $\Lambda(\mathbf{v})$, we can
say with respect to (1-24) that in thus altering region $R_1$ we have exchanged points
$\mathbf{v}$ of greater value $\Lambda(\mathbf{v})$, those in $a$, for points of lesser value, those in $b$. As a
result, the detection probability $Q_d$ in (1-24) has decreased, while the false-alarm
probability $Q_0$ has remained the same. The dichotomy of the data space $R_n$ into
regions $R_0$ and $R_1$ separated by the surface $\Lambda(\mathbf{v}) = \lambda$ must therefore be optimum
under the Neyman-Pearson criterion.

Figure 1-1. Deformation of decision surface.

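Symbolically, the change $\delta Q_d$ in the probability of detection when we deform
the surface $D$ into $D'$ is

$$\delta Q_d = \int_b p_1(\mathbf{v})\, d^n v - \int_a p_1(\mathbf{v})\, d^n v = \int_b \Lambda(\mathbf{v})\, p_0(\mathbf{v})\, d^n v - \int_a \Lambda(\mathbf{v})\, p_0(\mathbf{v})\, d^n v,$$

where $\Lambda(\mathbf{v})$ is the likelihood ratio defined in (1-21). In region $a$, $\Lambda(\mathbf{v}) > \lambda$; in region
$b$, $\Lambda(\mathbf{v}) < \lambda$. Therefore,

$$\int_a \Lambda(\mathbf{v})\, p_0(\mathbf{v})\, d^n v > \lambda \int_a p_0(\mathbf{v})\, d^n v, \qquad \int_b \Lambda(\mathbf{v})\, p_0(\mathbf{v})\, d^n v < \lambda \int_b p_0(\mathbf{v})\, d^n v,$$

and subtracting these we find

$$\delta Q_d < \lambda \int_b p_0(\mathbf{v})\, d^n v - \lambda \int_a p_0(\mathbf{v})\, d^n v = 0$$

because

$$\int_a p_0(\mathbf{v})\, d^n v = \int_b p_0(\mathbf{v})\, d^n v.$$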
The detection probability $Q_d$ thus decreases when the decision surface is deformed
in this way.

Another way to derive the optimum strategy under the Neyman-Pearson criterion
is to apply the calculus of variations. We want to minimize the probability
$Q_1 = 1 - Q_d$ of a false dismissal under the constraint that the false-alarm probability
$Q_0$ take on a preassigned value, say $\bar{Q}_0$. Introducing the Lagrange multiplier $\lambda$,
we are to minimize

$$\bar{C} = Q_1 + \lambda (Q_0 - \bar{Q}_0) \tag{1-25}$$

by varying the position of the decision surface $D$. We can solve this problem by
drawing on the results of Sec. 1.1.1. The average cost of operating a strategy that
chooses between hypotheses $H_0$ and $H_1$ when their prior probabilities are $\zeta_0$ and $\zeta_1$
($\zeta_0 + \zeta_1 = 1$) and the costs are $C_{ij}$ is, as in (1-14),

$$\bar{C} = \zeta_0 \left[C_{00}(1 - Q_0) + C_{10} Q_0\right] + \zeta_1 \left[C_{01} Q_1 + C_{11}(1 - Q_1)\right].$$

Apart from an additive constant, which does not affect the minimization, this takes
the form (1-25) if we choose the costs so that $C_{10} - C_{00} = 2\lambda$ and $C_{01} - C_{11} = 2$
and set $\zeta_0 = \zeta_1 = \tfrac{1}{2}$. The analysis leading to (1-17) shows that the decision surface
minimizing the average cost $\bar{C}$ in (1-14) is given by the equation $\Lambda(\mathbf{v}) = \Lambda_0 = \lambda$;
we simply substitute these costs and prior probabilities into the expression for $\Lambda_0$ in
(1-17). The value of the Lagrange multiplier $\lambda$ is then chosen to satisfy the constraint

$$Q_0 = \bar{Q}_0.$$
1.2.3 Operating Characteristic 

As the parameter $\lambda$ in (1-21) varies from 0 to $\infty$, the decision surface $D$ moves
through the space $R_n$ of the data $\mathbf{v}$, bounding for each value of $\lambda$ the region $R_1(\lambda)$
of points $\mathbf{v}$ that lead to the choice of hypothesis $H_1$. The false-alarm probability

$$Q_0(\lambda) = \int_{R_1(\lambda)} p_0(\mathbf{v})\, d^n v \tag{1-26}$$

and the detection probability

$$Q_d(\lambda) = \int_{R_1(\lambda)} p_1(\mathbf{v})\, d^n v \tag{1-27}$$

are in this way functions of $\lambda$, as is indeed evident also from (1-22) and (1-23). By
letting $\lambda$ range over $(0, \infty)$, we can plot the probability $Q_d(\lambda)$ of detection versus
the probability $Q_0(\lambda)$ of a false alarm; the resulting curve is called the operating
characteristic of the receiver. It depends only on the probability density functions
of the data $\mathbf{v}$ under the two hypotheses and not on any costs or prior probabilities.
Figure 1-2 depicts a typical operating characteristic. The parameter $\lambda$ varies from
$\lambda = 0$ at the upper terminus (1, 1) of the curve to $\lambda = \infty$ at the lower terminus
(0, 0).

Figure 1-2. Operating characteristic. The dashed line is tangent to the operating
characteristic at a point whose coordinates are $(Q_0(\lambda), Q_d(\lambda))$.

The slope of the operating characteristic at any point equals the value of the
parameter $\lambda$ at that point:

$$\frac{dQ_d}{dQ_0} = \lambda. \tag{1-28}$$

Indeed, the optimum strategy must be such as to make $C$ in (1-25) a stationary
point with respect to variations in both the position of the decision surface $D$ and
the value of $\lambda$. We find by differentiating (1-25) with respect to $\lambda$ and setting the
result equal to zero that

$$\frac{dQ_1}{d\lambda} + \lambda \frac{dQ_0}{d\lambda} + Q_0 - \bar{Q}_0 = 0.$$

The value of $\lambda$ is chosen so that $Q_0 = \bar{Q}_0$, and because $Q_1 = 1 - Q_d$, we obtain
(1-28) immediately.

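When the parameter $\lambda$ decreases from $\infty$ to 0, the slope of the tangent to the
operating characteristic decreases monotonely from the origin (0, 0) to the upper
terminus (1, 1), and the operating characteristic is therefore convex $\cap$.

In an alternative demonstration of (1-28), we consider two nearby decision
surfaces in the data space $R_n$. They are specified respectively by $\Lambda(\mathbf{v}) = \lambda$ and
$\Lambda(\mathbf{v}) = \lambda + d\lambda$. Designate the space between them by $dR_\lambda$. Then

$$\Pr(\lambda < \Lambda(\mathbf{v}) < \lambda + d\lambda \mid H_1) = P_1(\lambda)\,d\lambda = \int_{dR_\lambda} p_1(\mathbf{v})\,d^n v.$$

With $d\lambda$ infinitesimal, the lamina $dR_\lambda$ is so thin that within it

$$\Lambda(\mathbf{v}) \approx \lambda,$$

and therefore

$$P_1(\lambda)\,d\lambda = \int_{dR_\lambda} \Lambda(\mathbf{v})\,p_0(\mathbf{v})\,d^n v = \lambda \int_{dR_\lambda} p_0(\mathbf{v})\,d^n v.$$

Likewise,

$$\Pr(\lambda < \Lambda(\mathbf{v}) < \lambda + d\lambda \mid H_0) = P_0(\lambda)\,d\lambda = \int_{dR_\lambda} p_0(\mathbf{v})\,d^n v.$$

Dividing, we obtain

$$\frac{P_1(\lambda)}{P_0(\lambda)} = \lambda, \qquad 0 < \lambda < \infty. \tag{1-29}$$

From (1-22) and (1-23), however,

$$\frac{dQ_0}{d\lambda} = -P_0(\lambda), \qquad \frac{dQ_d}{d\lambda} = -P_1(\lambda).$$

The slope of the operating characteristic is thus

$$\frac{dQ_d}{dQ_0} = \frac{dQ_d/d\lambda}{dQ_0/d\lambda} = \frac{P_1(\lambda)}{P_0(\lambda)} = \lambda$$

as in (1-28). Equation (1-29) is useful, for it enables us to calculate the probability
density function $P_1(\Lambda)$ of the likelihood ratio under hypothesis $H_1$ when we know
its density function $P_0(\Lambda)$ under $H_0$, or vice versa:

$$P_1(\Lambda) = \Lambda P_0(\Lambda). \tag{1-30}$$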
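
The statement that the slope of the operating characteristic equals the decision level on the likelihood ratio lends itself to a quick numerical check. The sketch below is illustrative only. It assumes a single datum $v$ that is Gaussian with unit variance and expected value 0 under $H_0$ and $d$ under $H_1$, a model chosen here for convenience, and the function names `roc_point` and `roc_slope` are hypothetical. For this model $\Lambda(v) = \exp(dv - d^2/2)$, so comparing $\Lambda(v)$ with $\lambda$ amounts to comparing $v$ with $\ln\lambda/d + d/2$.

```python
from math import log
from statistics import NormalDist

STD = NormalDist()  # zero mean, unit variance


def roc_point(lam, d):
    """(Q0, Qd) of the likelihood-ratio test with level lam.

    Assumed model: p0 = N(0, 1), p1 = N(d, 1), with d > 0.
    """
    v0 = log(lam) / d + d / 2  # Lambda(v) > lam  <=>  v > v0
    return 1 - STD.cdf(v0), 1 - STD.cdf(v0 - d)


def roc_slope(lam, d, rel_step=1e-4):
    """Central-difference estimate of dQd/dQ0 at the point indexed by lam."""
    q0_lo, qd_lo = roc_point(lam * (1 - rel_step), d)
    q0_hi, qd_hi = roc_point(lam * (1 + rel_step), d)
    return (qd_hi - qd_lo) / (q0_hi - q0_lo)


# Estimated slopes at three decision levels; each should be close to lam itself.
slopes = {lam: roc_slope(lam, d=1.5) for lam in (0.25, 1.0, 4.0)}
```

Each entry of `slopes` should agree with its key to within the accuracy of the finite difference, and the slopes decrease as the point moves from (0, 0) toward (1, 1).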

1.2.4 Sufficient Statistics

The likelihood ratio $\Lambda(\mathbf{v}) = p_1(\mathbf{v})/p_0(\mathbf{v})$ embodies all the information contained in
the data $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ that is relevant to deciding between hypotheses $H_0$ and
$H_1$ in the optimum fashion. If someone else measures the data $\mathbf{v}$, computes the
likelihood ratio $\Lambda(\mathbf{v})$, and tells you only the result, you can attain the same minimum
Bayes cost $\bar{C}_{\min}$ by calculating the likelihood ratio

$$\Lambda = \frac{P_1(\Lambda)}{P_0(\Lambda)} \tag{1-31}$$

and comparing it with the decision level $\Lambda_0$ given by (1-17). For this reason,
$\Lambda = \Lambda(\mathbf{v})$ is called a sufficient statistic. Any monotone function $G = G(\Lambda)$ of the
likelihood ratio will do as well. Without loss of generality we suppose this to be
an increasing function of $\Lambda$. If $G$ exceeds a certain decision level $G_0$, hypothesis $H_1$
is chosen, otherwise $H_0$. Under the Neyman-Pearson criterion the value of $G_0$ is
picked so that the false-alarm probability

$$Q_0 = \int_{G_0}^{\infty} p_0(G)\,dG \tag{1-32}$$

takes on the preassigned value, where $p_0(G)$ is the probability density function of $G$
under hypothesis $H_0$. The probability $Q_d$ of detection is then

$$Q_d = \int_{G_0}^{\infty} p_1(G)\,dG \tag{1-33}$$

with $p_1(G)$ the density function of $G$ under hypothesis $H_1$. The quantity $G$ is a
function of the data $(v_1, v_2, \ldots, v_n)$ and, like the likelihood ratio $\Lambda(\mathbf{v})$, embodies all
the information in them that contributes to making the decision in the best possible
way under our two criteria. It too is a sufficient statistic.

A likelihood ratio can be formed with the statistic $G$ as well,

$$\Lambda(G) = \frac{p_1(G)}{p_0(G)}, \tag{1-34}$$

where $p_0(G)$ and $p_1(G)$ are the probability density functions of $G$ under the two
hypotheses. Under the Bayes criterion the level $G_0$ with which the statistic $G$ is to
be compared is given by the equation

$$\Lambda(G_0) = \frac{p_1(G_0)}{p_0(G_0)} = \Lambda_0 = \frac{\zeta_0 (C_{10} - C_{00})}{\zeta_1 (C_{01} - C_{11})}. \tag{1-35}$$

The points on the operating characteristic can be indexed with the values of $G_0$, and
the parametric equations of that curve are

$$Q_d = Q_d(G_0), \qquad Q_0 = Q_0(G_0),$$

with these functions of $G_0$ given by (1-32) and (1-33).

The sufficient statistic most commonly used is proportional to the logarithm
of the likelihood ratio,

$$g = \ln \Lambda(\mathbf{v}) = \ln \frac{p_1(\mathbf{v})}{p_0(\mathbf{v})}. \tag{1-36}$$

When the probability density functions are jointly Gaussian, this statistic takes a
particularly simple form. Furthermore, when data are taken in statistically independent
batches, whatever their joint probability density functions, the logarithm of the
likelihood ratio equals the sum of this statistic for each batch. The value of $g$ for
the batch comprises all the relevant information in the batch and can be measured
in its stead.
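
The additivity of the logarithm of the likelihood ratio over independent batches can be illustrated with a few lines of code. This is a minimal sketch under an assumed model, not part of the theory above: the samples are taken to be independent Gaussian variables with unit variance and expected value 0 under $H_0$ and 1 under $H_1$, and the names `log_gauss` and `log_likelihood_ratio` as well as the sample values are hypothetical.

```python
from math import log, pi


def log_gauss(v, mean, var):
    """Natural logarithm of the Gaussian density with the given mean and variance."""
    return -0.5 * log(2 * pi * var) - (v - mean) ** 2 / (2 * var)


def log_likelihood_ratio(batch, a0=0.0, a1=1.0, var=1.0):
    """g = ln[p1(v)/p0(v)] for a batch of independent Gaussian samples (assumed model)."""
    return sum(log_gauss(v, a1, var) - log_gauss(v, a0, var) for v in batch)


batch_a = [0.3, -1.2, 0.8]   # illustrative data, first batch
batch_b = [1.9, 0.4]         # illustrative data, second batch

g_total = log_likelihood_ratio(batch_a + batch_b)                        # all data at once
g_sum = log_likelihood_ratio(batch_a) + log_likelihood_ratio(batch_b)    # batch by batch
```

Here `g_total` and `g_sum` agree up to rounding, so a receiver need only store one number per batch rather than the raw samples.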

Example 1-2 n Gaussian data with unequal expected values 

As a simple example of these concepts, suppose that the quantities $(v_1, v_2, \ldots, v_n)$ are
statistically independent and have Gaussian distributions with variance $\sigma^2$. Let the
expected value of each be $a_0$ under hypothesis $H_0$ and $a_1$ under $H_1$, $a_0 < a_1$. Their joint
probability density functions are then products of Gaussian density functions of the
form in (1-10),

$$p_k(\mathbf{v}) = (2\pi\sigma^2)^{-n/2} \exp\left[-\sum_{i=1}^{n} \frac{(v_i - a_k)^2}{2\sigma^2}\right], \qquad k = 0, 1, \tag{1-37}$$

[Hel91, p. 215]. The likelihood ratio is now

$$\Lambda(\mathbf{v}) = \frac{p_1(\mathbf{v})}{p_0(\mathbf{v})} = \exp\left[\frac{a_1 - a_0}{\sigma^2} \sum_{i=1}^{n} v_i - \frac{n(a_1^2 - a_0^2)}{2\sigma^2}\right]. \tag{1-38}$$

The observer will choose hypothesis $H_0$ when $\Lambda(\mathbf{v}) < \Lambda_0$, where the value of $\Lambda_0$ depends
on the decision criterion used. Because the exponential function is monotone, the
decision can just as well be based on the value of

$$V = \frac{1}{n} \sum_{i=1}^{n} v_i,$$

the sample mean of the observations, which is to be compared with the quantity $V_0$
given by

$$V_0 = \frac{a_0 + a_1}{2} + \frac{\sigma^2 \ln \Lambda_0}{n(a_1 - a_0)}.$$

With $a_1 > a_0$, hypothesis $H_0$ is chosen when $V < V_0$ and $H_1$ when $V > V_0$. If the
Bayes criterion is being used, $V_0$ depends on the costs and prior probabilities through
$\Lambda_0$, which is given by (1-17). In this example the decision surface $D$ is the hyperplane

$$\sum_{i=1}^{n} v_i = n V_0.$$

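The sample mean $V$ is thus a sufficient statistic in the sense just explained. It
is a Gaussian random variable because it is a linear combination of Gaussian random
variables [Hel91, p. 245], [Pap91, p. 197]. Its expected values under hypotheses $H_0$ and
$H_1$ are $a_0$ and $a_1$, respectively, and its variance is $\sigma^2/n$. Its probability density functions
under the two hypotheses are therefore

$$p_k(V) = \left(\frac{n}{2\pi\sigma^2}\right)^{1/2} \exp\left[-\frac{n(V - a_k)^2}{2\sigma^2}\right], \qquad k = 0, 1,$$

and its likelihood ratio is

$$\Lambda(V) = \frac{p_1(V)}{p_0(V)} = \exp\left[\frac{n(a_1 - a_0)V}{\sigma^2} - \frac{n(a_1^2 - a_0^2)}{2\sigma^2}\right],$$

which of course equals $\Lambda(\mathbf{v})$ in (1-38).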
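
Because the sample mean $V$ is Gaussian with variance $\sigma^2/n$ under both hypotheses, the probability that it exceeds any level is a Gaussian tail probability, and a decision level $V_0$ meeting a prescribed false-alarm probability can be found by inverting that tail. The following is a minimal sketch of that calculation, assuming Python's standard `statistics.NormalDist`; the function names and the numerical values are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # zero mean, unit variance


def gauss_tail(x):
    """Gaussian tail integral (2*pi)**-0.5 * integral from x to infinity of exp(-t*t/2) dt."""
    return 1.0 - STD.cdf(x)


def sample_mean_test(a0, a1, sigma, n, q0):
    """Return (V0, Qd) for the sample-mean test at false-alarm probability q0."""
    scale = sigma / sqrt(n)                     # standard deviation of V
    v0 = a0 + scale * STD.inv_cdf(1.0 - q0)     # Pr(V > v0 | H0) = q0
    qd = gauss_tail((v0 - a1) / scale)          # Pr(V > v0 | H1)
    return v0, qd


# Illustrative numbers: 16 samples, sigma = 2, expected values 0 and 1.
v0, qd = sample_mean_test(a0=0.0, a1=1.0, sigma=2.0, n=16, q0=0.05)
```

Raising `n` shrinks the standard deviation of $V$ and, for the same false-alarm probability, raises the detection probability.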

The probabilities of error of the first and second kinds are

$$Q_0 = \Pr(V > V_0 \mid H_0) = \operatorname{erfc}\left[\frac{(V_0 - a_0)\sqrt{n}}{\sigma}\right], \qquad Q_1 = \Pr(V < V_0 \mid H_1) = 1 - \operatorname{erfc}\left[\frac{(V_0 - a_1)\sqrt{n}}{\sigma}\right],$$

where erfc(·) is defined in (1-11). To use the Neyman-Pearson criterion, the observer
chooses the value of $V_0$ so that the probability $Q_0$ of an error of the first kind takes on
the preassigned value.

Example 1-3 Gaussian data with unequal variances

Suppose the observer must decide which of two sources of Gaussian random noise
is present in an input voltage, the one having a variance equal to $N_0$, the other a
variance equal to $N_1$. The expected value of the noise voltage is zero in both cases.
He measures the voltage at $n$ times far enough apart that the results $(v_1, v_2, \ldots, v_n)$
are statistically independent. The two hypotheses between which he must choose are
($H_0$), "The variance of the voltage equals $N_0$," and ($H_1$), "The variance of the voltage
equals $N_1$." We assume that $N_1 > N_0$, as when under hypothesis $H_1$ a noiselike signal
is present in addition to the usual background noise, whose variance equals $N_0$. The
joint probability density functions of the set of $n$ measured voltages are given by

$$p_k(\mathbf{v}) = (2\pi N_k)^{-n/2} \exp\left(-\sum_{i=1}^{n} \frac{v_i^2}{2N_k}\right), \qquad k = 0, 1. \tag{1-39}$$

The likelihood ratio for these measurements is

$$\Lambda(\mathbf{v}) = \left(\frac{N_0}{N_1}\right)^{n/2} \exp\left[\frac{1}{2}\left(N_0^{-1} - N_1^{-1}\right) \sum_{i=1}^{n} v_i^2\right]. \tag{1-40}$$

The observer computes this likelihood ratio for the outcome of the experiment and
compares it with a fixed quantity $\Lambda_0$. If $\Lambda(\mathbf{v}) < \Lambda_0$, he decides that the variance of the
input voltage was $N_0$, otherwise that it was $N_1$.

The likelihood ratio in (1-40) depends on the data $\mathbf{v}$ only through the sum of
squares

$$S = \sum_{i=1}^{n} v_i^2, \tag{1-41}$$

and the observer can base his decision just as well on the value of $S$, comparing it with
a decision level $S_0 = r_0^2$ given by

$$S_0 = r_0^2 = \frac{2 N_0 N_1}{N_1 - N_0} \ln\left[\Lambda_0 \left(\frac{N_1}{N_0}\right)^{n/2}\right];$$

if $S < r_0^2$, he chooses hypothesis $H_0$, otherwise $H_1$. The decision surface $D$ is an
$n$-dimensional hypersphere of radius $r_0$, and the regions $R_0$ and $R_1$ into which the data
space $R_n$ is divided are, respectively, the interior and exterior of this hypersphere. The
sum of squares $S$ is a sufficient statistic.

The false-alarm probability is the probability under hypothesis $H_0$ that the point
$\mathbf{v} = (v_1, v_2, \ldots, v_n)$ lies outside the hypersphere defined by

$$\sum_{i=1}^{n} v_i^2 = r_0^2,$$

and this probability is calculated by integrating the joint density function $p_0(\mathbf{v})$ over the
exterior $R_1$ of the hypersphere:

$$Q_0 = \Pr(S > S_0 \mid H_0) = (2\pi N_0)^{-n/2} \int_{R_1} \exp\left(-\sum_{i=1}^{n} \frac{v_i^2}{2N_0}\right) d^n v. \tag{1-42}$$

The integrand is constant on hyperspherical shells, and the volume of such a shell of
radius $r$ and thickness $dr$ is

$$dV = \frac{2\pi^{n/2}}{\Gamma(n/2)}\, r^{n-1}\, dr$$

[Edw73, p. 339], where $\Gamma(x)$ is the gamma function. [For positive integral values of $x$,
$\Gamma(x) = (x - 1)!$. For half-integral values one can use the formulas $\Gamma(x + 1) = x\Gamma(x)$,
$\Gamma(\frac{1}{2}) = \pi^{1/2}$. Thus $\Gamma(3/2) = \frac{1}{2}\pi^{1/2}$, and so forth.]
The integral in (1-42) can be written

$$Q_0(S_0) = \frac{2}{(2N_0)^{n/2}\,\Gamma(n/2)} \int_{r_0}^{\infty} r^{n-1} \exp\left(-\frac{r^2}{2N_0}\right) dr = \left[\Gamma\left(\frac{n}{2}\right)\right]^{-1} \int_{S_0/2N_0}^{\infty} u^{n/2-1} e^{-u}\, du, \tag{1-43}$$

which can be evaluated by Pearson's tables of the incomplete gamma function [Pea34].
Similarly, the probability of detection is

$$Q_d(S_0) = \left[\Gamma\left(\frac{n}{2}\right)\right]^{-1} \int_{S_0/2N_1}^{\infty} u^{n/2-1} e^{-u}\, du. \tag{1-44}$$

The statistic $S$ has a scaled chi-squared distribution under each hypothesis [Hel91,
p. 220], [Pap91, p. 79].
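
When the number $n$ of samples is even, the incomplete gamma integrals giving the false-alarm and detection probabilities of the sum-of-squares test reduce to finite sums, since $[\Gamma(m)]^{-1}\int_x^\infty u^{m-1}e^{-u}\,du = e^{-x}\sum_{k=0}^{m-1} x^k/k!$ for a positive integer $m$. The sketch below uses that standard identity, with lower limits $S_0/2N_0$ and $S_0/2N_1$, to evaluate both probabilities without tables. It is an illustration only: the function names and the numerical values are hypothetical, and odd $n$ is deliberately not handled.

```python
from math import exp, log


def upper_gamma_ratio(m, x):
    """Gamma(m, x) / Gamma(m) for a positive integer m, via the finite-sum form."""
    term, total = 1.0, 1.0
    for k in range(1, m):
        term *= x / k          # term now equals x**k / k!
        total += term
    return exp(-x) * total


def variance_test(n, n0, n1, s0):
    """Return (Q0, Qd) for the sum-of-squares test with decision level s0 (even n only)."""
    if n % 2:
        raise ValueError("this sketch handles even n only")
    m = n // 2
    return upper_gamma_ratio(m, s0 / (2 * n0)), upper_gamma_ratio(m, s0 / (2 * n1))


# Illustrative case: n = 2 samples, N0 = 1, N1 = 4, and S0 chosen so that Q0 = 0.01.
q0, qd = variance_test(n=2, n0=1.0, n1=4.0, s0=2 * log(100))
```

For $n = 2$ the sums collapse to single exponentials, which makes the result easy to confirm by hand.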

By differentiating (1-43) and (1-44) with respect to $S_0$ and dividing, we can verify
the counterpart of (1-34):

$$\Lambda(S) = \frac{p_1(S)}{p_0(S)} = \left(\frac{N_0}{N_1}\right)^{n/2} \exp\left[\frac{1}{2}\left(N_0^{-1} - N_1^{-1}\right) S\right] \tag{1-45}$$

for the density functions of the statistic $S$. This should be compared with (1-40).

The data $v_1, v_2, \ldots, v_n$ might be processed in some other way. The receiver might,
for instance, use instead of $S$ the statistic

$$S' = \sum_{i=1}^{n} v_i^4.$$

It would be compared with a decision level $S_0'$ set to yield a preassigned false-alarm
probability $Q_0$. Calculating the false-alarm and detection probabilities for this new
statistic in order to compare its performance with that of the one defined in (1-41)
would be quite difficult. Our general theory assures us, however, that when the
false-alarm probabilities are set equal, the detection probability attained by the sum $S'$ of
fourth powers of the data will be less than that attained by the sum $S$ of their squares.

1.2.5 Decisions Based on Discrete Random Variables

In the receiver of an optical communication or radar system, the entrant light may 
be filtered and focused onto a photoelectrically emissive surface, and decisions about 
the presence or absence of a signal may be based on the number k of photoelectrons 
emitted during some observation interval $(0, T)$. The number $k$, taking on only
nonnegative integral values, is a discrete random variable that tends to be larger 
when a signal is present than when only background light enters the receiver. A 
more elaborate device might be designed to detect a radiant object by counting the 
numbers $k_1, k_2, \ldots, k_n$ of photoelectrons emitted by each of $n$ detectors sensitive
to light waves having different frequencies or arriving from different directions. In 
order to develop optimum strategies for such receivers, decision theory must take 
account of the discrete nature of data of this kind. The methods of Secs. 1.2.1 and
1.2.2 are easily modified.

A decision is to be made between two hypotheses $H_0$ and $H_1$ about a set of
discrete random variables $(x_1, x_2, \ldots, x_n)$ observed in some experiment. The
probability under hypothesis $H_0$ that the data take on the set of values $\mathbf{x} = (x_1, x_2, \ldots, x_n)$
is denoted by

$$P_0(x_1, x_2, \ldots, x_n) = P_0(\mathbf{x});$$

the probability under hypothesis $H_1$ that these values are observed is

$$P_1(x_1, x_2, \ldots, x_n) = P_1(\mathbf{x}).$$

One calls $P_0(\mathbf{x})$ and $P_1(\mathbf{x})$ the joint probability mass functions of the data under the
two hypotheses. When these probabilities are summed over all possible values of
each of the $x$'s, the result is 1:

$$\sum_{\mathbf{x}} P_k(\mathbf{x}) = 1, \qquad k = 0, 1,$$

[Hel91, p. 203]. We can again think of $(x_1, x_2, \ldots, x_n)$ as the coordinates of a point $\mathbf{x}$
in the $n$-dimensional Cartesian space $R_n$, but now all the probability is concentrated
at a countable set of points. The probability that the data point falls into a region
$A$ of the space $R_n$ is the sum of the probabilities attached to those points $\mathbf{x}$ in $A$;
symbolically,

$$\Pr(\mathbf{x} \in A \mid H_k) = \sum_{\mathbf{x} \in A} P_k(\mathbf{x}), \qquad k = 0, 1. \tag{1-46}$$

A decision strategy again divides the space $R_n$ into two regions $R_0$ and $R_1$;
when the data point $\mathbf{x}$ falls into region $R_k$, hypothesis $H_k$ is selected, $k = 0, 1$. The
average risk associated with this strategy is given by an expression like that in (1-14),
except that the integrations over $R_0$ and $R_1$ are replaced by summations over the
data points $\mathbf{x}$ in those regions. Conditional risks can be defined as in (1-13) in terms
of the posterior probabilities

$$\Pr(H_k \mid \mathbf{x}) = \frac{\zeta_k P_k(\mathbf{x})}{P(\mathbf{x})}, \qquad k = 0, 1, \tag{1-47}$$

where $\zeta_0$ and $\zeta_1$ are the prior probabilities of hypotheses $H_0$ and $H_1$, and

$$P(\mathbf{x}) = \zeta_0 P_0(\mathbf{x}) + \zeta_1 P_1(\mathbf{x}) \tag{1-48}$$

is the total probability of observing the set $\mathbf{x}$ of data. The average cost for a given
strategy is then, as in (1-14),

$$\bar{C} = \sum_{\mathbf{x} \in R_0} C(\to H_0 \mid \mathbf{x})\, P(\mathbf{x}) + \sum_{\mathbf{x} \in R_1} C(\to H_1 \mid \mathbf{x})\, P(\mathbf{x})$$

with the conditional risks $C(\to H_0 \mid \mathbf{x})$ and $C(\to H_1 \mid \mathbf{x})$ defined as in (1-13). Again
the average cost is minimum for the strategy that picks the hypothesis with the smaller
conditional risk, and this implies as before that the optimum decision regions are

$$R_0 = \{\mathbf{x}: \Lambda(\mathbf{x}) < \Lambda_0\}, \qquad R_1 = \{\mathbf{x}: \Lambda(\mathbf{x}) > \Lambda_0\}, \tag{1-49}$$

where now the likelihood ratio

$$\Lambda(\mathbf{x}) = \frac{P_1(\mathbf{x})}{P_0(\mathbf{x})} \tag{1-50}$$

is the quotient of the probabilities of the data $\mathbf{x}$ under the two hypotheses. The
decision level $\Lambda_0$ is again

$$\Lambda_0 = \frac{\zeta_0 (C_{10} - C_{00})}{\zeta_1 (C_{01} - C_{11})}. \tag{1-51}$$

The optimum strategy is, as before, to compare the likelihood ratio $\Lambda(\mathbf{x})$ with $\Lambda_0$
and to choose hypothesis $H_1$ if it exceeds $\Lambda_0$ and hypothesis $H_0$ otherwise. The
false-alarm and detection probabilities are

$$Q_0 = \sum_{\mathbf{x} \in R_1} P_0(\mathbf{x}), \qquad Q_d = \sum_{\mathbf{x} \in R_1} P_1(\mathbf{x}), \tag{1-52}$$

respectively.

Example 1-4 Geometrically distributed datum 

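Suppose that under each hypothesis the single datum $x$ has a geometrical distribution
of nonnegative integral values,

$$P_0(x) = (1 - v_0)\, v_0^x, \qquad P_1(x) = (1 - v_1)\, v_1^x, \qquad v_1 > v_0,$$
$$v_k = \frac{m_k}{m_k + 1}, \qquad k = 0, 1, \qquad x = 0, 1, 2, \ldots, \tag{1-53}$$

where $m_0$ and $m_1$ are the expected values of the datum under hypotheses $H_0$ and $H_1$,
respectively. The likelihood ratio is

$$\Lambda(x) = \frac{1 - v_1}{1 - v_0} \left(\frac{v_1}{v_0}\right)^x, \tag{1-54}$$

and hypothesis $H_1$ is chosen if

$$x > n_0 = \left\lfloor \frac{\ln\left(\dfrac{1 - v_0}{1 - v_1}\right) + \ln \Lambda_0}{\ln(v_1/v_0)} \right\rfloor, \tag{1-55}$$

where $\lfloor u \rfloor$ denotes the greatest integer in a real number $u$. The false-alarm and detection
probabilities are

$$Q_0 = (1 - v_0) \sum_{k=n_0+1}^{\infty} v_0^k = v_0^{n_0+1}, \qquad Q_d = v_1^{n_0+1}, \tag{1-56}$$

and the probability of error is

$$P_e = \zeta_0 Q_0 + \zeta_1 (1 - Q_d).$$

This is minimum when $\Lambda_0$ is set as in (1-51), with $C_{10} - C_{00} = C_{01} - C_{11}$.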
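
The decision rule of this example is easily put into code. The sketch below is illustrative, with a hypothetical function name and hypothetical numerical values. It assumes $m_1 > m_0 > 0$ and a positive decision level on the likelihood ratio, computes the integer decision level as the greatest integer in the ratio of logarithms, and evaluates the false-alarm and detection probabilities as the $(n_0 + 1)$-th powers of $v_0 = m_0/(m_0 + 1)$ and $v_1 = m_1/(m_1 + 1)$.

```python
from math import floor, log


def geometric_test(m0, m1, lam0):
    """Decision level n0 and (Q0, Qd) for the geometric example; requires m1 > m0 > 0."""
    v0, v1 = m0 / (m0 + 1), m1 / (m1 + 1)
    u = (log((1 - v0) / (1 - v1)) + log(lam0)) / log(v1 / v0)
    n0 = max(floor(u), -1)      # n0 = -1 means that H1 is chosen for every x >= 0
    return n0, v0 ** (n0 + 1), v1 ** (n0 + 1)


# Illustrative case: m0 = 1, m1 = 3, decision level 1 on the likelihood ratio.
n0, q0, qd = geometric_test(m0=1.0, m1=3.0, lam0=1.0)
# Here n0 == 1, q0 == 0.25, qd == 0.5625: H1 is chosen when x >= 2.
```

Only a discrete set of false-alarm probabilities, the powers of $v_0$, can be reached in this way.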

The Bayes strategy for optimum decisions among more than two hypotheses 
can be developed in a similar manner and is again expressed by (1-5) or (1-15). In 
calculating the conditional probabilities $\Pr(H_k \mid \mathbf{x})$, the probabilities $P_k(\mathbf{x})$ take the
place of the probability density functions that figured in (1-4). 

Because the datum $x$ in Example 1-4 takes on only integral values, the decision
level with which it is compared is effectively an integer, and (1-56) shows that the
false-alarm probability $Q_0$ can take on only one of a countable set of values. If the
Neyman-Pearson criterion has been adopted, it may happen that the preassigned
value $Q_0$ of the false-alarm probability is not a member of that set. In order to
achieve an arbitrary value of the false-alarm probability, it is necessary to resort to
what is called randomization.

In Example 1-4 a randomized strategy prescribes choosing hypothesis $H_1$ whenever
the integral datum $x$ exceeds a certain decision level $n_0$, and $H_0$ when $x < n_0$;
but when $x$ equals $n_0$ exactly, hypothesis $H_1$ is chosen with a certain probability $f$
and hypothesis $H_0$ with probability $1 - f$. The false-alarm probability is then

$$Q_0 = \sum_{x=n_0+1}^{\infty} P_0(x) + f P_0(n_0). \tag{1-57}$$

The values of $n_0$ and $f$ are selected to yield the desired value of this probability. Let
us write (1-57) as

$$f P_0(n_0) = Q_0 - 1 + \sum_{x=0}^{n_0} P_0(x).$$

We sum the values of $P_0(x)$, starting with $x = 0$, until the right side first becomes
positive. The value of $x$ at which that occurs becomes the decision level $n_0$, and
by dividing by $P_0(n_0)$, we obtain the value of the probability $f$. The probability of
detection is then

$$Q_d = \sum_{x=n_0+1}^{\infty} P_1(x) + f P_1(n_0). \tag{1-58}$$
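
The search for the decision level and the randomization probability can be sketched in a few lines for the geometrically distributed datum of Example 1-4. This is a minimal illustration with a hypothetical function name and hypothetical numbers: it accumulates $P_0(x)$ from $x = 0$ until the quantity $Q_0 - 1 + \sum P_0(x)$ first becomes positive, divides by $P_0(n_0)$ to obtain $f$, and then evaluates the detection probability. It assumes a target false-alarm probability in $(0, 1]$ that is not so small as to be lost in floating-point rounding.

```python
def randomized_geometric_test(m0, m1, q0_target):
    """Return (n0, f, Qd) for the randomized test with false-alarm probability q0_target."""
    v0, v1 = m0 / (m0 + 1), m1 / (m1 + 1)

    def p0(x):
        return (1 - v0) * v0 ** x

    def p1(x):
        return (1 - v1) * v1 ** x

    n0 = 0
    rhs = q0_target - 1 + p0(0)      # Q0 - 1 + sum of P0(x) for x = 0..n0
    while rhs <= 0:                  # continue until the right side is positive
        n0 += 1
        rhs += p0(n0)
    f = rhs / p0(n0)
    qd = v1 ** (n0 + 1) + f * p1(n0)  # tail sum of P1 beyond n0, plus f * P1(n0)
    return n0, f, qd


# Illustrative case: m0 = 1, m1 = 3, target false-alarm probability 0.3.
n0, f, qd = randomized_geometric_test(m0=1.0, m1=3.0, q0_target=0.3)
# Here n0 == 1, f is 0.2 and qd is 0.6, up to rounding.
```

With $m_0 = 1$ the unrandomized test can reach only the values 1, 1/2, 1/4, and so on, and the target 0.3 is met by mixing the levels $n_0 = 0$ and $n_0 = 1$.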

Figure 1-3 exhibits the operating characteristic of the randomized strategy for
our example of a geometrically distributed datum. It is simple to construct such an
operating characteristic. The vertices of the polygon are the points at which $f = 1$
and the decision level jumps from one integer $n_0$ to the next; in this example they are
given by (1-56) for all nonnegative integers $n_0$. One plots these vertices and connects
them with straight lines.

In general, the likelihood ratio $\Lambda(\mathbf{x}) = P_1(\mathbf{x})/P_0(\mathbf{x})$ takes on only a countable
set of values $\lambda_k$, which we can arrange in increasing order

$$0 < \lambda_1 < \lambda_2 < \cdots < \lambda_{k-1} < \lambda_k < \lambda_{k+1} < \cdots < \infty.$$

A number of sets $\mathbf{x}$ of data may yield the same value of $\Lambda(\mathbf{x})$. The optimum
randomized strategy under the Neyman-Pearson criterion can be described as follows.
The decision level is a particular one of those numbers, say $\lambda_K$. When $\Lambda(\mathbf{x}) > \lambda_K$,
hypothesis $H_1$ is chosen, and when $\Lambda(\mathbf{x}) < \lambda_K$, hypothesis $H_0$ is chosen; but when
$\Lambda(\mathbf{x}) = \lambda_K$, hypothesis $H_1$ is chosen with probability $f$ and $H_0$ with probability $1 - f$.
The false-alarm probability is

$$Q_0 = f \Pr(\Lambda(\mathbf{x}) = \lambda_K \mid H_0) + \sum_{k=K+1}^{\infty} \Pr(\Lambda(\mathbf{x}) = \lambda_k \mid H_0), \tag{1-59}$$

and by the same technique as for (1-57), the values of $f$ and $K$ can be determined
so that the right side of (1-59) equals the preassigned value of the false-alarm
probability. The probability of detection is then

$$Q_d = f \Pr(\Lambda(\mathbf{x}) = \lambda_K \mid H_1) + \sum_{k=K+1}^{\infty} \Pr(\Lambda(\mathbf{x}) = \lambda_k \mid H_1). \tag{1-60}$$

Figure 1-3. Operating characteristic: decision based on geometrically distributed
datum, $m_0 = 1$. Curves are indexed with expected value $m_1$ under hypothesis $H_1$.

The chance device for making the decision whenever $\Lambda(\mathbf{x}) = \lambda_K$ can contain
a random-number generator that upon that occasion produces a random number
with a uniform probability distribution over the interval (0, 1). If the random
number lies between 0 and $f$, hypothesis $H_1$ is chosen, otherwise $H_0$. In practice one
will ordinarily accept a pair of slightly higher or lower false-alarm and detection 
probabilities in order to avoid the need for such a chance device. Randomization 
is useful mainly for theoretical purposes: one wants to evaluate different strategies 
or to plot detection probabilities under different conditions of signal intensity and 
background illumination, and it is awkward to make comparisons unless the false- 
alarm probability takes a common value throughout. When the data are discrete 
random variables, randomization is then necessary. 
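
A chance device of the kind just described takes only a few lines in software. The sketch below is a hypothetical illustration, not a prescribed design: it draws a uniform random number from Python's standard `random` module whenever the likelihood ratio equals the decision level, and the function name, the seed, and the numerical values are arbitrary choices.

```python
import random


def randomized_decision(lam, lam_K, f, rng=random):
    """Return 1 (choose H1) or 0 (choose H0) for an observed likelihood ratio lam."""
    if lam > lam_K:
        return 1
    if lam < lam_K:
        return 0
    # Tie: choose H1 with probability f, using a uniform number on [0, 1).
    return 1 if rng.random() < f else 0


# Empirical check of the tie-breaking rule with a seeded generator.
rng = random.Random(1)
trials = [randomized_decision(2.0, 2.0, 0.2, rng) for _ in range(20000)]
fraction_h1 = sum(trials) / len(trials)   # should be close to f = 0.2
```

Passing an explicitly seeded generator keeps such a simulation reproducible, which is convenient when strategies are compared at a common false-alarm probability.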

In this chapter we have confined ourselves to decisions between simple hy- 
potheses, for which the probability density functions of the data contain no unknown 
parameters. When unknown parameters appear in the distributions, the hypotheses 
are called composite. In Chapter 3 we shall treat the problem of deciding between 
a simple and a composite hypothesis. Receiver design is essentially a matter of
generating likelihood ratios such as $\Lambda_j(\mathbf{v})$ in (1-7), $\Lambda(\mathbf{v})$ in (1-21), $\Lambda(\mathbf{x})$ in (1-50), their
equivalents, or, in complicated situations, suitable approximations thereto. The
decision levels with which they are compared depend on whether the Bayes or the
Neyman-Pearson criterion has been adopted, and this aspect can in general be left
up to the user of the receiver. Once a design has been chosen, its performance should
be evaluated in terms of error probabilities or detection probabilities, and we shall
give some attention to the often difficult problem of calculating them.



Problems 

1-1. Find the Bayes test to choose between the hypotheses $H_0$ and $H_1$, whose prior
probabilities are 0.6 and 0.4, respectively, when under $H_0$ the datum $x$ has the probability
density function

$$p_0(x) = \left(\frac{2}{\pi}\right)^{1/2} \exp\left(-\tfrac{1}{2}x^2\right) U(x),$$

where $U(\cdot)$ is the unit step function,

$$U(x) = \begin{cases} 0, & x < 0, \\ 1, & x \ge 0. \end{cases} \tag{1-61}$$

Under $H_1$ $x$ has the probability density function

$$p_1(x) = e^{-x}\, U(x).$$

($x$ is always positive.) Let the relative costs of the two kinds of errors be equal. Find
the minimum attainable probability $P_e$ of error.
1-2. Under hypothesis $H_1$ the random variable $x$ has a uniform distribution over $(-1, 1)$,

$$p_1(x) = \tfrac{1}{2}, \quad -1 < x < 1; \qquad p_1(x) = 0, \quad |x| > 1.$$

Under hypothesis $H_0$ the distribution of $x$ is triangular,

$$p_0(x) = 1 - |x|, \quad -1 < x < 1; \qquad p_0(x) = 0, \quad |x| > 1.$$

We need to decide between $H_1$ and $H_0$ on the basis of the value of $x$ in a single
observation. The costs attending the decision are

$$C_{10} = 3, \qquad C_{01} = 4, \qquad C_{00} = C_{11} = 0;$$

$C_{ij}$ is the cost of choosing hypothesis $H_i$ when $H_j$ is true.

Find the Bayes strategy for choosing between $H_0$ and $H_1$ for arbitrary prior
probability $\zeta$ of $H_0$, $0 < \zeta < 1$, and calculate the minimum attainable Bayes cost $\bar{C}_{\min}(\zeta)$.
Sketch the graph of that cost versus $\zeta$. Determine the value of $\zeta$ for which it is maximum.

1-3. Consider the minimum Bayes cost $\bar{C}_{\min}$ in a binary decision as a function of the prior
probability $\zeta_0 = \zeta$ of hypothesis $H_0$. Use (1-14) to write $\bar{C}_{\min}(\zeta)$ in terms of the
false-alarm probability $Q_0$ and the detection probability $Q_d$, which are now functions of $\zeta$
through their dependence on $\lambda = \Lambda_0$ as given in (1-17). Use (1-28) to calculate the slope
$d\bar{C}_{\min}/d\zeta$ of this function. Show that the function $\bar{C}_{\min}(\zeta)$ is convex $\cap$ in $0 < \zeta < 1$.

If the costs $C_{ij}$ are known, but not the prior probabilities $\zeta_0 = \zeta$, $\zeta_1 = 1 - \zeta$, the
most conservative strategy is the Bayes strategy set up for that value of $\zeta$ at which the
minimum Bayes cost $\bar{C}_{\min}(\zeta)$ is greatest. It is called the minimax strategy; Problem 1-2
furnishes an example. Determine a linear relation between $Q_0(\zeta)$ and $Q_d(\zeta)$ from which
the minimax value of $\zeta$ can be calculated. Use the operating characteristic to develop
a graphical method for finding this value.

1-4. In another approach to the proof that the function $\bar{C}_{\min}(\zeta)$ introduced in Problem 1-3
is convex $\cap$, suppose that the Bayes strategy has been set up under the assumption that
the prior probability of hypothesis $H_0$ equals $\zeta_0 = \zeta'$, but the actual relative frequency
of that hypothesis equals $\zeta$. In terms of the elements of the cost matrix $C$, write down
the associated cost of operating the strategy as a function of $\zeta$. It will depend on the
false-alarm probability $Q_0(\zeta')$ and the detection probability $Q_d(\zeta')$ characterizing the
Bayes strategy adopted. This function will be linear in $\zeta$ and represented by a straight
line in a graph of cost versus prior probability. Explain why that straight line must
lie above the curve of the function $\bar{C}_{\min}(\zeta)$ representing the average cost of the Bayes
strategy, except at the point $\zeta = \zeta'$. From that fact deduce the convexity of the latter
function.

1-5. Under hypothesis $H_0$ a datum $x$ has the probability density function

$$p_0(x) = \begin{cases} A_0 (a^2 - x^2), & |x| < a, \\ 0, & |x| > a, \end{cases}$$

and under hypothesis $H_1$ its probability density function is

$$p_1(x) = \begin{cases} A_1 (b^2 - x^2), & |x| < b, \\ 0, & |x| > b, \end{cases}$$

with $b > a$. Calculate the constants $A_0$ and $A_1$. Determine the optimum strategy under
the Neyman-Pearson criterion for deciding between the two hypotheses, and calculate
the false-alarm and detection probabilities for it. With $b = 2a$, sketch the operating
characteristic for this optimum strategy.

1-6. The random variables $x$ and $y$ are Gaussian with expected value 0 and variance 1.
Their covariance $\operatorname{Cov}(x, y)$ may be either 0 or some known positive value $r > 0$. Show
that the best choice between these possibilities on the basis of a measurement of $x$
and $y$ depends on where the point $(x, y)$ lies with respect to a certain hyperbola in the
$(x, y)$-plane. Hint: Under hypothesis $H_1$ the joint probability density function of $x$
and $y$ is

$$p_1(x, y) = \frac{1}{2\pi (1 - r^2)^{1/2}} \exp\left[-\frac{x^2 + y^2 - 2rxy}{2(1 - r^2)}\right];$$

under hypothesis $H_0$ it has the same form, except that $r = 0$ [Hel91, p. 160], [Pap91,
p. 127].

1-7. A random variable $x$ is distributed according to a Cauchy distribution,

$$p(x) = \frac{m}{\pi (m^2 + x^2)}.$$

The parameter $m$ can take on either of two values $m_0$ and $m_1$, $m_0 < m_1$. Design a
statistical test to decide on the basis of a single measurement of $x$ between the two
hypotheses $H_0$ ($m = m_0$) and $H_1$ ($m = m_1$). Use the Neyman-Pearson criterion. For
this test calculate the power $Q_d = 1 - Q_1$ as a function of the size $Q_0$.
1-8. A choice is to be made between hypotheses $H_0$ and $H_1$ on the basis of a single
measurement of a quantity $x$. Under hypothesis $H_0$, $x = n$; under $H_1$, $x = s + n$. Here
both $s$ and $n$ are independent positive random variables with the probability density
functions

$$p(n) = b\, e^{-bn}\, U(n), \qquad p(s) = c\, e^{-cs}\, U(s).$$

Calculate the probability density functions of the datum $x$ under each hypothesis. Find
the decision level on $x$ to yield a given false-alarm probability $Q_0$, and calculate the
probability $Q_d$ of correctly choosing hypothesis $H_1$.

1-9. Given are $M$ independent data $\mathbf{v} = (v_1, v_2, \ldots, v_M)$. Under hypothesis $H_0$ each has a
bilateral exponential, or "Laplace," distribution,

$$p_0(v) = \tfrac{1}{2}\, e^{-|v|}.$$

Under hypothesis $H_1$ the distribution of $v$ is a shifted version of this, $p_1(v) = p_0(v - s)$,
$s > 0$. Determine a sufficient statistic for deciding between these two hypotheses under
the Neyman-Pearson criterion.

1-10. For a single datum $v$ having the same distributions under hypotheses $H_0$ and $H_1$ as in
Problem 1-9, determine the likelihood ratio $\Lambda(v)$ and sketch it as a function of $v$. For
all positive values of $\Lambda_0$ in (1-17), determine the regions $R_0$ and $R_1$ on the $v$-axis in
which hypotheses $H_0$ and $H_1$ are respectively chosen.

1-11. Under hypothesis $H_0$ the datum $v$ has the Cauchy probability density function

$$p_0(v) = \frac{1}{\pi (1 + v^2)};$$

under hypothesis $H_1$ it has the displaced Cauchy density function

$$p_1(v) = \frac{1}{\pi [1 + (v - s)^2]}, \qquad s > 0.$$

Sketch the likelihood ratio $\Lambda(v)$ as a function of $v$. For all positive values of $\Lambda_0$ in
(1-17), determine the regions $R_0$ and $R_1$ on the $v$-axis in which hypotheses $H_0$ and $H_1$
are respectively chosen.

1-12. For the logarithm $g$ of the likelihood ratio, as defined by (1-36),

$$g = \ln \Lambda(\mathbf{v}) = \ln \frac{p_1(\mathbf{v})}{p_0(\mathbf{v})},$$

define the moment generating functions of $g$ under hypotheses $H_0$ and $H_1$ by

$$h_j(s) = E(e^{-sg} \mid H_j), \qquad j = 0, 1,$$

[Hel91, p. 276], [Pap91, p. 115]. Show that $h_1(s) = h_0(s - 1)$. Determine $h_0(s)$ and $h_1(s)$
for the logarithms of the likelihood ratios in Examples 1-2 and 1-3 in Sec. 1.2.

1-13. Assume that one has calculated the minimum error probability $P_e(\zeta)$ attainable in the
choice between hypotheses $H_0$ and $H_1$ as a function of the prior probability $\zeta = \zeta_0$ of
hypothesis $H_0$. Show that the false-alarm and detection probabilities are then given by

$$Q_0(\zeta) = P_e + (1 - \zeta) \frac{dP_e}{d\zeta}, \qquad Q_d(\zeta) = 1 - P_e + \zeta \frac{dP_e}{d\zeta}.$$

1-14. The datum $x$ is a nonnegative integer with Poisson probabilities under hypotheses $H_0$
and $H_1$,

$$P_k(x) = \frac{m_k^x}{x!} \exp(-m_k), \qquad x = 0, 1, 2, \ldots, \qquad k = 0, 1,$$

[Hel91, p. 43], [Pap91, p. 76]. In terms of the expected values $m_0$ and $m_1$, the decision
level $n_0$, and the fraction $f$, determine the false-alarm and detection probabilities of
the optimum randomized strategy for deciding between $H_0$ and $H_1$ under the
Neyman-Pearson criterion, as described in Sec. 1.2.5. Show how to calculate both the decision
level $n_0$ on $x$ and the probability $f$ with which $H_1$ is chosen when $x = n_0$. Taking
$m_0 = 1$ and $m_1 = 3$, draw the operating characteristic, and assuming the relative error
costs to be equal, draw the curve of the minimum attainable probability $P_e(\zeta)$ of error
as a function of the prior probability $\zeta = \zeta_0$ of hypothesis $H_0$.

1-15. A sequence of $n$ independent measurements is taken of a Poisson-distributed variable
$v$ whose expected value is $m_0$ under hypothesis $H_0$ and $m_1$ under hypothesis $H_1$, as
in Problem 1-14. On what combination of the measurements should a Bayes test be
based, and with what decision level should its outcome be compared, for given prior
probabilities $\zeta_0$ and $\zeta_1$ of the two hypotheses and a given cost matrix $C$?
1-16. Prove from (1-59) and (1-60) that for the optimum randomized strategy with discrete
data the slope of the operating characteristic equals the decision level $\lambda$ on the likelihood
ratio, as in (1-28).

1-17. Under hypothesis $H_0$ the datum $x$ is uniformly distributed over (0, 1). Under $H_1$ it is
uniformly distributed over $(a, a + 1)$, with $0 < a < 1$. Determine the optimum strategy
under the Neyman-Pearson criterion for choosing between $H_0$ and $H_1$. Observe that as
the likelihood ratio now takes on only a finite number of possible values, randomization
is necessary. Calculate the operating characteristic for this hypothesis test, that is, the
graph of the detection probability $Q_d$ versus the false-alarm probability $Q_0$, and sketch
it.

1-18. The decision about whether a certain optical signal is present or not is based on the
numbers $n_1$ and $n_2$ of photoelectrons emitted during an observation interval $(0, T)$ from
two separate photoelectrically emissive surfaces onto which the light from the source,
along with background light, will fall. Under hypothesis $H_0$, "signal absent," these
numbers have independent Poisson distributions,

$$\Pr(n_1 = k_1, n_2 = k_2 \mid H_0) = \frac{m_{01}^{k_1} m_{02}^{k_2}}{k_1!\, k_2!} \exp(-m_{01} - m_{02}),$$

and under hypothesis $H_1$, "signal present," their joint probability mass function is

$$\Pr(n_1 = k_1, n_2 = k_2 \mid H_1) = \frac{m_{11}^{k_1} m_{12}^{k_2}}{k_1!\, k_2!} \exp(-m_{11} - m_{12}), \qquad m_{11} > m_{01}, \quad m_{12} > m_{02}.$$

Show how to process the data $n_1$ and $n_2$ to yield the maximum probability $Q_d$ of
detection for fixed false-alarm probability $Q_0$. Explain in detail how to calculate that
maximum detection probability $Q_d$.

2

Detection of a Known Signal


2.1 GAUSSIAN NOISE 

With the theory of statistical tests introduced in Chapter 1, we can attack the simplest
signal-detection problem, that of deciding whether a signal $s(t)$ of specified form
has arrived at a definite time in the midst of Gaussian noise. Gaussian noise is
ubiquitous. It originates in the thermal fluctuations of all matter in the universe,
which create randomly varying electromagnetic fields that excite the antenna of the
receiver and generate a fluctuating voltage $n(t)$ between its terminals. The thermal
fluctuations of the ions and electrons in the input resistor connected across those
terminals also contribute to this thermal noise. As the sum of the minuscule effects
of an enormous number of randomly moving charges, the noise $n(t)$ is a Gaussian
stochastic process by virtue of the central-limit theorem [Hel91, pp. 260-5], [Pap91,
p. 214]. Other types of noise, such as clutter in radar and reverberation in sonar, 
which can also be modeled as Gaussian random processes, are sometimes present as 
well. The detection of a known signal in Gaussian noise is a fundamental problem 
of signal-detection theory. 

The input $v(t)$ to the receiver, which can be taken as the voltage between the
terminals of its antenna, is measured during an observation interval $0 < t < T$. On
the basis of this input the receiver must choose one of two hypotheses: ($H_0$) there is
no signal present, and the input consists only of Gaussian noise with expected value
zero, $v(t) = n(t)$; or ($H_1$) the input is the sum of the expected signal and the noise,
$v(t) = s(t) + n(t)$. The receiver embodies a criterion by which its success in a large
number of decisions of this kind can be evaluated; as discussed in Chapter 1, this
criterion will influence how it processes the data. For example, the signal might be
a rectangular pulse of duration $T' < T$ that occurs at a definite time within the
observation interval. A communication system
might be using such pulses to convey a message that has been translated into a
binary code with symbols 0 and 1. Every $T$ seconds a pulse is or is not sent,
depending on whether the current message symbol is a 1 or a 0. At the end of each
interval of $T$ seconds, the receiver decides which of the symbols was transmitted.
Because of the noise it will occasionally err, and the designer's aim may be to
minimize the probability of its doing so, errors in the two symbols having been judged
equally expensive. The decision criterion is then the Bayes criterion with equal relative costs,
$C_{10} - C_{00} = C_{01} - C_{11}$; the relative frequencies $\zeta_0$ and $\zeta_1$ with which the transmitter
sends the symbols 0 and 1 are known. Alternatively, the Neyman-Pearson criterion
may be adopted and the probability of detection maximized for a preassigned value
of the false-alarm probability.

In order to determine the likelihood ratio on which, according to what we 
learned in Chapter 1, the receiver will base its decisions, we must set up an appro- 
priate method of sampling the input v(t) and then write down the joint probability 
density functions of the samples. To this end, we begin by reviewing the properties 
of Gaussian noise, forgetting the signal s(t) for the present and concentrating on the 
probabilistic description of the noise. Out of its probability distributions we shall in 
Sec. 2.2 form the likelihood ratio, pass to the limit of an infinite number of samples, 
and show how the resulting receiver structure can be realized by a certain linear 
filter. Then we shall calculate the error probabilities characterizing the performance 
of the optimum receiver. 

2,1,1 The Density Functions of Gaussian Noise 

The input v(t) is a stochastic process that we can assume continues throughout an 
infinite interval (-oo, oo). It is defined through the array of all joint probability 
density functions 



of its samples v\ ~ v{t\\ v 2 = v{t 2 \ v„ = v(t n ) taken at an arbitrary number 
n of arbitrary times t u t 2 , ... , t„. These functions p(v u v 2s ... , v„) have the basic 
properties of joint probability density functions mentioned at the beginning of Sec, 
1.1. When as now the noise is Gaussian, the samples x>i, v 2 , ... , v„ are Gaussian 
or normally distributed random variables; that is, their joint probability density 
functions have the form 



sit) = A, 
s(t) m 0, 



< t x < t < h + 7" < T, 
t <t\ y t > ?! + T'. 



p(v u v 2 , ...,v n ) 



P(vi,v 2 , ... y v„) = M„ exp 




(2-1) 
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The quantities |xj^ form an n x n matrix that we designate by p.,, . The normalization 
constant M„ is 

M„ =(2<nr n/2 \det i L„\ l/2 , (2-2) 

where "det" stands for the determinant [Hel91, p. 241], [Pap91, p. 197}. 
The expected values of all samples of the noise are zero:

$$E[v(t) \mid H_0] \equiv 0.$$

In particular, the first-order probability density function of a single sample v = v(t)
of the noise is

$$p(v) = \frac{1}{\sqrt{2\pi\sigma_t^2}} \exp\left(-\frac{v^2}{2\sigma_t^2}\right),$$

where $\sigma_t^2 = \operatorname{Var} v(t)$ is the variance of the noise at time t.

The matrix $\mu_n$ appearing in (2-1) and (2-2) is the inverse

$$\mu_n = \phi_n^{-1}$$

of the symmetric n x n matrix $\phi_n$ whose elements $\phi_{jk}$ are the covariances of the
samples $v_j = v(t_j)$ and $v_k = v(t_k)$,

$$\phi_{jk} = \phi_{kj} = \operatorname{Cov}(v_j, v_k) = E[v(t_j) v(t_k) \mid H_0].$$

These matrix elements are determined by the autocovariance function

$$\phi(t, s) = E[v(t) v(s) \mid H_0] \tag{2-3}$$

of the noise [Hel91, p. 363], [Pap91, p. 289]. In particular, the variance at time t is

$$\sigma_t^2 = \phi(t, t). \tag{2-4}$$

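The bivariate density function of a pair of samples $v_1 = v(t_1)$, $v_2 = v(t_2)$ of the
noise, for instance, depends on the 2 x 2 covariance matrix

$$\begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} = \begin{bmatrix} \sigma_1^2 & r\sigma_1\sigma_2 \\ r\sigma_1\sigma_2 & \sigma_2^2 \end{bmatrix},$$

where $\sigma_k^2 = \operatorname{Var} v(t_k)$, k = 1, 2, and $r = r(t_1, t_2) = \phi_{12}/\sqrt{\phi_{11}\phi_{22}}$ is the correlation
coefficient of the samples. The inverse of this covariance matrix is

$$\begin{bmatrix} \mu_{11} & \mu_{12} \\ \mu_{21} & \mu_{22} \end{bmatrix} = \frac{1}{\sigma_1^2 \sigma_2^2 (1 - r^2)} \begin{bmatrix} \sigma_2^2 & -r\sigma_1\sigma_2 \\ -r\sigma_1\sigma_2 & \sigma_1^2 \end{bmatrix}.$$

(A factor in front of any matrix multiplies each of its elements.) Then (2-1) with
n = 2 becomes

$$p(v_1, v_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - r^2}} \exp\left[-\frac{1}{2(1 - r^2)} \left(\frac{v_1^2}{\sigma_1^2} - \frac{2 r v_1 v_2}{\sigma_1 \sigma_2} + \frac{v_2^2}{\sigma_2^2}\right)\right].$$

A convenient notation collects the samples $(v_1, v_2, \ldots, v_n)$ into a column vector
$v$, whose transposed row vector is denoted by $v^T$, and it permits writing the joint
probability density function of these samples concisely as

$$p(v) = (2\pi)^{-n/2} \left|\det \mu\right|^{1/2} \exp\left(-\tfrac{1}{2} v^T \mu v\right), \qquad \mu = \phi^{-1}. \tag{2-5}$$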

We shall make frequent use of such matrix notation hereafter. 
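As a numerical illustration of the matrix notation, the following sketch evaluates the zero-mean Gaussian density in the matrix form (2-5) and compares it with the explicit bivariate expression. The function names and the numerical values of the standard deviations, correlation coefficient, and sample point are assumptions chosen only for the example.

```python
import numpy as np

def gaussian_density(v, phi):
    """Zero-mean joint Gaussian density in the matrix form (2-5)."""
    v = np.asarray(v, dtype=float)
    n = v.size
    mu = np.linalg.inv(phi)                      # mu is the inverse covariance matrix
    norm = (2.0 * np.pi) ** (-n / 2.0) * np.sqrt(abs(np.linalg.det(mu)))
    return float(norm * np.exp(-0.5 * v @ mu @ v))

def bivariate_density(v1, v2, s1, s2, r):
    """Explicit bivariate density obtained from (2-1) with n = 2."""
    q = (v1 / s1) ** 2 - 2.0 * r * v1 * v2 / (s1 * s2) + (v2 / s2) ** 2
    return float(np.exp(-q / (2.0 * (1.0 - r ** 2)))
                 / (2.0 * np.pi * s1 * s2 * np.sqrt(1.0 - r ** 2)))

# Illustrative (assumed) standard deviations and correlation coefficient.
s1, s2, r = 1.5, 0.7, 0.4
phi = np.array([[s1 ** 2, r * s1 * s2],
                [r * s1 * s2, s2 ** 2]])

p_matrix = gaussian_density([0.3, -0.2], phi)
p_closed = bivariate_density(0.3, -0.2, s1, s2, r)
```

The two numbers agree to rounding error, which is a quick check that the inverse matrix and the normalization constant have been assembled consistently.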

2.1.2 Stationary Noise 

When the noise is stationary, its autocovariance function $\phi(t_1, t_2)$ depends on the
times $t_1$ and $t_2$ only through the interval $\tau = t_2 - t_1$ between them, and it is written
$\phi(t_2 - t_1) = \phi(\tau)$. This autocovariance function is an even function

$$\phi(\tau) = \phi(-\tau),$$

and $\phi(0) = \sigma^2$ is the variance, now constant, of the noise. The autocovariance
function is the Fourier transform

$$\phi(\tau) = \int_{-\infty}^{\infty} \Phi(\omega) e^{i\omega\tau} \frac{d\omega}{2\pi} \tag{2-6}$$

of a nonnegative function $\Phi(\omega)$ known as the spectral density of the noise:

$$\Phi(\omega) = \int_{-\infty}^{\infty} \phi(\tau) e^{-i\omega\tau} d\tau, \qquad \Phi(\omega) = \Phi(-\omega) \ge 0, \tag{2-7}$$

[Hel91, p. 383].$^1$ The variance of the noise, in particular, is the integral of the
spectral density,

$$\sigma^2 = \phi(0) = \int_{-\infty}^{\infty} \Phi(\omega) \frac{d\omega}{2\pi}. \tag{2-8}$$

The average power in the spectral components of the process lying between positive
frequencies $\omega/2\pi$ and $(\omega + d\omega)/2\pi$ (Hz) equals

$$[\Phi(\omega) + \Phi(-\omega)] \frac{d\omega}{2\pi} = 2\Phi(\omega) \frac{d\omega}{2\pi}.$$

It is measured by a spectrum analyzer [Hel91, pp. 433-9], [Pap91, pp. 438-9]. 
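The transform pair (2-6) and (2-7) can be checked numerically for a specific case. The sketch below assumes an exponential autocovariance $\phi(\tau) = \sigma^2 e^{-\mu|\tau|}$, which is not taken from the text; its spectral density is the Lorentzian $2\mu\sigma^2/(\mu^2 + \omega^2)$. The parameter values, grid sizes, and function names are likewise illustrative assumptions.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule, written out so the sketch does not depend on a NumPy version."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

sigma2, mu = 2.0, 1.5          # assumed variance and decay rate

def phi(tau):
    """Assumed autocovariance sigma^2 * exp(-mu * |tau|)."""
    return sigma2 * np.exp(-mu * np.abs(tau))

def Phi_closed(omega):
    """Spectral density of the assumed autocovariance."""
    return 2.0 * mu * sigma2 / (mu ** 2 + omega ** 2)

def Phi_numeric(omega, tau_max=40.0, n=400001):
    """Evaluate (2-7); phi is even, so the transform is twice a cosine integral over tau > 0."""
    tau = np.linspace(0.0, tau_max, n)
    return 2.0 * trapezoid(phi(tau) * np.cos(omega * tau), tau)

# (2-8): the variance is the integral of the spectral density over d(omega)/(2*pi).
omega = np.linspace(-3000.0, 3000.0, 600001)
variance = trapezoid(Phi_closed(omega), omega) / (2.0 * np.pi)
```

The recovered variance falls slightly short of $\sigma^2$ only because the frequency integral is truncated at a finite limit.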

When stationary Gaussian noise v(t) passes through a stationary linear filter
whose impulse response is $k(\tau)$, the output

$$v_o(t) = \int_0^{\infty} k(\tau) v(t - \tau) d\tau$$

is a stationary Gaussian random process whose spectral density is

$$\Phi_o(\omega) = |y(\omega)|^2 \Phi(\omega), \tag{2-9}$$

where

$$y(\omega) = \int_0^{\infty} k(\tau) e^{-i\omega\tau} d\tau \tag{2-10}$$

is the transfer function of the filter [Hel91, p. 384], [Pap91, p. 324].

1 Papoulis [Pap91, p. 319] defines the spectral density as the Fourier transform of the autocorrelation
function. When as here the process v(t) has expected value zero, this distinction is of no consequence.

2.1.3 Sampling by Orthonormal Functions



In determining optimum strategies for decisions about Gaussian random processes,
as for the detection of known signals in Gaussian noise, it is convenient to sample
random processes such as v(t) not at particular instants of time, as in Sec. 2.1.1, but
by means of their expansions in a particular kind of Fourier series. Let a random
process v(t) be observed during an interval (0, T). Define an infinite set of functions
$f_k(t)$ orthonormal over that interval in the sense that

$$\int_0^T [f_k(t)]^2 dt = 1, \quad k = 1, 2, 3, \ldots, \qquad \int_0^T f_j(t) f_k(t) dt = 0, \quad j \ne k. \tag{2-11}$$

The most familiar set of orthonormal functions is made up of sines and cosines, for example

$$f_1(t) = T^{-1/2}, \qquad f_{2m}(t) = \left(\frac{2}{T}\right)^{1/2} \cos\frac{2\pi m t}{T}, \qquad f_{2m+1}(t) = \left(\frac{2}{T}\right)^{1/2} \sin\frac{2\pi m t}{T}, \qquad m = 1, 2, 3, \ldots$$

Another set is obtained by shifting and scaling the Legendre polynomials:

$$f_k(t) = \left(\frac{2k - 1}{T}\right)^{1/2} P_{k-1}\left(\frac{2t}{T} - 1\right), \qquad k = 1, 2, 3, \ldots$$

There is no limit to the number of sets of orthonormal functions that can
be constructed, and in a particular problem, one set may be more convenient than
another. Indeed, given any infinitely numerous set of linearly independent functions
$g_1(t), g_2(t), \ldots$, one can construct from them a set of functions $\{f_k(t)\}$ orthonormal
over the interval (0, T) by using what is known as the Gram-Schmidt procedure. It
will be explained in Sec. 2.1.4.
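The orthonormality of the shifted and scaled Legendre polynomials over (0, T) can be verified numerically. The sketch below is only an illustration: the helper name `legendre_basis`, the value of T, and the use of Gauss-Legendre quadrature (exact for the polynomial integrands involved) are choices made for the example.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_basis(k, t, T):
    """f_k(t) = sqrt((2k - 1)/T) * P_{k-1}(2t/T - 1), for k = 1, 2, 3, ..."""
    coeffs = np.zeros(k)
    coeffs[-1] = 1.0                       # selects the Legendre polynomial P_{k-1}
    return np.sqrt((2 * k - 1) / T) * legendre.legval(2.0 * t / T - 1.0, coeffs)

T = 3.0                                    # assumed observation interval
x, w = legendre.leggauss(12)               # nodes and weights on (-1, 1)
t = 0.5 * T * (x + 1.0)                    # nodes mapped to (0, T)
weights = 0.5 * T * w

# Matrix of integrals of f_j(t) * f_k(t) over (0, T) for j, k = 1..5.
F = np.array([legendre_basis(k, t, T) for k in range(1, 6)])
gram = (F * weights) @ F.T
```

The matrix `gram` should equal the identity, the diagonal expressing unit normalization and the off-diagonal zeros expressing orthogonality.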

We write the random process v(t) as a Fourier series,

$$v(t) = \sum_{k=1}^{\infty} v_k f_k(t), \tag{2-12}$$

and as its "samples" we take the coefficients

$$v_k = \int_0^T f_k(s) v(s) ds. \tag{2-13}$$

We assume that the set of functions $f_k(t)$ is sufficiently numerous so that any
realization v(t) of the random process can be represented as in (2-12); it is said to be
complete. Substituting (2-13) into (2-12) shows that

$$\sum_{k=1}^{\infty} f_k(t) f_k(s) = \delta(t - s), \tag{2-14}$$

where $\delta(t - s)$ is the Dirac delta function, which is so defined that for any function
h(t)

$$\int_{-\infty}^{\infty} \delta(s) h(t - s) ds = h(t),$$

provided $h(\cdot)$ is continuous at t. Equation (2-14) is known as the completeness
relation for the set of orthonormal functions.

When v(t) is purely random noise, its expected value is zero at all times t, and
hence all coefficients have zero expected values:

$$E(v_k \mid H_0) = 0, \qquad \forall k. \tag{2-15}$$

The covariances of the samples are, by (2-3),

$$\phi_{jk} = E(v_j v_k \mid H_0) = \int_0^T \int_0^T f_j(t) f_k(s) E[v(t) v(s) \mid H_0] \, dt \, ds = \int_0^T \int_0^T f_j(t) \phi(t, s) f_k(s) \, dt \, ds \tag{2-16}$$

in terms of the autocovariance function $\phi(t, s)$ of the noise.

When v(t) is a Gaussian random process, the samples v k defined as in (2-13) 
are Gaussian random variables, the joint probability density function of any n of 
which is given by an expression like (2-1) or, in vector notation, (2-5). The reason 
is that any linear combination of Gaussian random variables is a Gaussian random 
variable [Hel91, p. 245], [Pap91, p. 197], and the quantities v k defined by (2-13) are 
linear combinations of the values of v(t) at all times t in (0, T).

The sample $v_k$ can be generated by passing the random process v(t) through
a linear filter whose impulse response is

$$h_k(\tau) = \begin{cases} f_k(T - \tau), & 0 \le \tau \le T, \\ 0, & \tau < 0, \ \tau > T, \end{cases} \tag{2-17}$$

and measuring the output of the filter at time t = T. The input v(t) having been
turned on at time t = 0, the output is

$$w_k(t) = \int_0^t h_k(\tau) v(t - \tau) d\tau,$$

and at time t = T, by (2-17),

$$w_k(T) = \int_0^T f_k(T - \tau) v(T - \tau) d\tau = \int_0^T f_k(u) v(u) du = v_k.$$

The filter whose impulse response is given by (2-17) is said to be matched to the
signal $f_k(t)$ over the interval (0, T). Matched filters will turn out to be most useful
in constructing optimum detectors. By passing the input v(t) through a bank of
filters matched to the "signals" $f_1(t), f_2(t), f_3(t), \ldots$ in our orthonormal set, we could
generate as many of the samples $v_1, v_2, v_3, \ldots$ as we liked. As we shall see, the
optimum processing of the input v(t) will not require us to do so.
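The equivalence between sampling with $f_k(t)$ and reading the output of the matched filter at t = T can be illustrated on a discrete time grid. In the sketch below the grid, the particular orthonormal function, and the test input are all assumptions made for the example; the integrals are replaced by Riemann sums over midpoints.

```python
import numpy as np

T, n = 1.0, 1000
dt = T / n
t = (np.arange(n) + 0.5) * dt                         # midpoint grid on (0, T)

# One orthonormal function on (0, T) (an assumed choice) and an arbitrary input.
f_k = np.sqrt(2.0 / T) * np.sin(3.0 * np.pi * t / T)
rng = np.random.default_rng(0)
v = np.cos(2.0 * np.pi * t) + 0.3 * rng.standard_normal(n)

h_k = f_k[::-1]                      # impulse response h_k(tau) = f_k(T - tau), as in (2-17)
w = np.convolve(h_k, v) * dt         # filter output, input switched on at t = 0
w_T = w[n - 1]                       # output sample corresponding to t = T
v_k = np.sum(f_k * v) * dt           # direct evaluation of the coefficient (2-13)
```

Here `w_T` and `v_k` coincide, since index n - 1 of the discrete convolution pairs each value of the reversed impulse response with the input value at the same instant.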

The expected value of the sum of the squares of all our samples equals the
average total energy of the noise v(t) received during the interval (0, T), for by
(2-12) and the orthonormality relation (2-11)

$$E \int_0^T [v(t)]^2 dt = \int_0^T \phi(t, t) dt = \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} E(v_k v_m) \int_0^T f_k(t) f_m(t) dt = \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} E(v_k v_m) \delta_{km} = \sum_{k=1}^{\infty} E(v_k^2). \tag{2-18}$$

Here we have used the Kronecker delta,

$$\delta_{km} = \begin{cases} 1, & k = m, \\ 0, & k \ne m. \end{cases} \tag{2-19}$$

If the noise is stationary, the quantity in (2-18) equals T times the variance $\sigma^2$ of
the noise as given in (2-8).

2.1.4 Gram-Schmidt Orthogonalization

The Gram-Schmidt orthogonalization procedure enables us to start with a set of
linearly independent functions $g_1(t), g_2(t), \ldots$, and from them to construct a set of
functions $f_1(t), f_2(t), \ldots$ that are orthonormal in the sense of (2-11). It rests on the
idea that functions of time t in (0, T) can be thought of as vectors in a Cartesian
space of infinite dimensionality.

What corresponds to the scalar product of any two functions h(t) and m(t)
will be denoted by

$$(h, m) = (m, h) = \int_0^T h(t) m(t) dt. \tag{2-20}$$

The "length" of a function h(t), or of the vector representing it in the Cartesian
space, is $(h, h)^{1/2}$. In terms of this notation, the orthonormality conditions (2-11) are

$$(f_k, f_m) = \delta_{km} = \begin{cases} 1, & k = m, \\ 0, & k \ne m. \end{cases} \tag{2-21}$$

The first element $f_1(t)$ of the new set of functions is taken proportional to $g_1(t)$:

$$f_1(t) = \beta_1 g_1(t), \qquad \beta_1 = (g_1, g_1)^{-1/2} = \left[\int_0^T [g_1(t)]^2 dt\right]^{-1/2}.$$

The constant $\beta_1$ serves to normalize the function $f_1(t)$ to unit length.
The second function will be a linear combination of $f_1(t)$ and $g_2(t)$,

$$f_2(t) = \alpha f_1(t) + \beta_2 g_2(t),$$

in which the constants $\alpha$ and $\beta_2$ remain to be chosen. Thinking of the functions
$g_1(t)$ and $g_2(t)$ as vectors, we see that they define a plane. The new function $f_1(t)$ lies
along $g_1(t)$, and the new function $f_2(t)$ lies in the same plane and is perpendicular
to $f_1(t)$. Taking the scalar product with $f_1(t)$, we obtain

$$(f_1, f_2) = \alpha (f_1, f_1) + \beta_2 (f_1, g_2) = 0,$$

and hence $\alpha = -\beta_2 (f_1, g_2)$. Furthermore,

$$1 = (f_2, f_2) = \alpha (f_2, f_1) + \beta_2 (f_2, g_2) = \beta_2 (f_2, g_2) = \beta_2 [\alpha (f_1, g_2) + \beta_2 (g_2, g_2)] = \beta_2^2 [(g_2, g_2) - (f_1, g_2)^2].$$

Hence $\beta_2 = [(g_2, g_2) - (f_1, g_2)^2]^{-1/2}$, and

$$f_2(t) = \beta_2 [g_2(t) - (f_1, g_2) f_1(t)].$$

As we continue through this Gram-Schmidt procedure, each new function $f_k(t)$
is a linear combination of $g_k(t)$, which has not yet been used, and of the orthonormal
functions $f_i(t)$ previously formed, $1 \le i \le k - 1$. If for some $\beta_k$ we set

$$f_k(t) = \beta_k \left[g_k(t) - \sum_{i=1}^{k-1} (f_i, g_k) f_i(t)\right], \tag{2-22}$$

the function $f_k(t)$ will be orthogonal to the previous members $f_j(t)$ of the set,
j = 1, 2, ..., k - 1, for by (2-21)

$$(f_j, f_k) = \beta_k \left[(f_j, g_k) - \sum_{i=1}^{k-1} (f_i, g_k) \delta_{ij}\right] = 0.$$

The constant $\beta_k$ in (2-22) is determined so that $(f_k, f_k) = 1$; that is, by (2-21) and
(2-22),

$$(f_k, f_k) = \beta_k (g_k, f_k) = \beta_k^2 \left[(g_k, g_k) - \sum_{i=1}^{k-1} (f_i, g_k)(g_k, f_i)\right] = 1,$$

and the normalization constant in (2-22) is

$$\beta_k = \left[(g_k, g_k) - \sum_{i=1}^{k-1} (f_i, g_k)^2\right]^{-1/2}.$$

In this way we can create from an infinite set of linearly independent functions $g_j(t)$
as many members $f_k(t)$ of a set of orthonormal functions as we need.
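The recursion (2-22) translates directly into a few lines of code. The following sketch is an illustration under stated assumptions: the functions are represented by their values on a midpoint grid, the scalar product (2-20) is approximated by a Riemann sum, and the starting functions $g_k(t) = t^{k-1}$ are chosen only for the example. Dividing the residual by its length is the same as multiplying by the constant $\beta_k$.

```python
import numpy as np

def inner(h, m, dt):
    """Discrete stand-in for the scalar product (2-20)."""
    return float(np.sum(h * m) * dt)

def gram_schmidt(g_list, dt):
    """Build orthonormal functions f_k from linearly independent g_k via (2-22)."""
    f_list = []
    for g in g_list:
        residual = g.copy()
        for f in f_list:
            residual = residual - inner(f, g, dt) * f      # subtract (f_i, g_k) f_i
        beta = inner(residual, residual, dt) ** -0.5       # normalization constant beta_k
        f_list.append(beta * residual)
    return f_list

T, n = 1.0, 2000
dt = T / n
t = (np.arange(n) + 0.5) * dt
g_list = [t ** k for k in range(5)]          # assumed starting set: 1, t, t^2, t^3, t^4

f_list = gram_schmidt(g_list, dt)
gram = np.array([[inner(fj, fk, dt) for fk in f_list] for fj in f_list])
```

The matrix `gram` of mutual scalar products comes out as the identity to rounding error. For sets that are nearly linearly dependent, the classical recursion used here can lose orthogonality in floating-point arithmetic, and a modified ordering of the subtractions is then usually preferred.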

2.1.5 White Noise 

The thermal noise that, as we said at the beginning of this section, is ever present at
the input to a receiver possesses a spectral density much broader than the spectra of
any signals one has occasion to detect, and even much broader than the passband
of the input circuitry (antenna, leads or waveguides, and so on) that conducts the
signals into the receiver. It is customary to model this thermal noise as a stationary
random process v(t) whose spectral density is uniform,

$$\Phi(\omega) = \frac{N}{2}, \tag{2-23}$$

over a range of angular frequencies $-\omega_0 < \omega < \omega_0$ encompassing the spectra of
the signals to be detected and extending far beyond them. The quantity N is the
unilateral spectral density of the noise: the average noise power passing through a
filter of unit gain and bandwidth $\Delta\nu = \Delta\omega/2\pi$ Hz, including components of both
positive and negative frequencies, equals $N\Delta\nu$. [The quantity N/2 in (2-23) is called
the bilateral spectral density; the noise is thought of as equally divided between
positive and negative frequencies.] Because of the uniformity of its spectral density,
this kind of noise is called white noise [Hel91, pp. 403-4], [Pap91, p. 295].
According to (2-6) the autocovariance function of white noise is

$$\phi(\tau) = \frac{N}{2} \delta(\tau). \tag{2-24}$$

This Dirac delta function $\delta(\tau)$ can be thought of as a peaked function of unit area,

$$\int_{-\infty}^{\infty} \delta(\tau) d\tau = 1,$$

whose duration is much shorter than any time interval our instruments can resolve.

Because (2-4) with (2-8) and (2-23) indicates that time samples of white noise
would have infinite variance, temporal sampling such as we started with in Sec. 2.1.1
is unsuitable. Instead we sample white noise by means of an arbitrary complete set of
orthonormal functions $\{f_k(t)\}$ as described in Sec. 2.1.3, and we find by (2-16) that
the samples $v_k$ are uncorrelated and hence statistically independent Gaussian random
variables with variances equal to N/2; the elements of their covariance matrix are

$$\phi_{jk} = \frac{N}{2} \int_0^T \int_0^T f_j(t) \delta(t - s) f_k(s) \, dt \, ds = \frac{N}{2} \int_0^T f_j(t) f_k(t) dt = \frac{N}{2} \delta_{jk}. \tag{2-25}$$

The joint probability density function of any set $v_1, v_2, \ldots, v_n$ of n of these samples
is therefore

$$p(v_1, v_2, \ldots, v_n) = \prod_{k=1}^{n} (\pi N)^{-1/2} \exp\left(-\frac{v_k^2}{N}\right) = (\pi N)^{-n/2} \exp\left(-\frac{1}{N} \sum_{k=1}^{n} v_k^2\right). \tag{2-26}$$
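The statement that orthonormal sampling of white noise yields uncorrelated samples of variance N/2 can be illustrated on a time grid. The sketch below is built on assumptions: white noise is replaced by independent Gaussian values of variance $(N/2)/\Delta t$ on a grid of spacing $\Delta t$ (a discrete stand-in for the delta function in (2-24)), a sine set is used for the $f_k(t)$, and all numerical values are arbitrary.

```python
import numpy as np

N, T, n = 4.0, 2.0, 400            # assumed unilateral spectral density, interval, grid size
dt = T / n
t = (np.arange(n) + 0.5) * dt

# Rows of F hold f_k(t) = sqrt(2/T) sin(k*pi*t/T), k = 1..6, orthonormal over (0, T).
F = np.array([np.sqrt(2.0 / T) * np.sin(k * np.pi * t / T) for k in range(1, 7)])

# Deterministic check of (2-25): covariance (N/2)/dt on the diagonal mimics (N/2) delta(t - s).
cov_time = (N / 2.0) / dt * np.eye(n)
cov_samples = F @ cov_time @ F.T * dt ** 2

# Monte Carlo check: draw discretized white noise and form the samples as in (2-13).
rng = np.random.default_rng(0)
noise = rng.standard_normal((4000, n)) * np.sqrt((N / 2.0) / dt)
samples = noise @ F.T * dt                      # shape (trials, 6)
cov_empirical = np.cov(samples, rowvar=False)
```

The matrix `cov_samples` equals (N/2) times the identity, and the empirical covariance approaches it as the number of trials grows.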

White noise can instructively be pictured as a dense succession of sharp pulses
occurring at random times $\tau_m$ and having independent and identically distributed
random amplitudes $a_m$:

$$v(t) = \sum_{m=-\infty}^{\infty} a_m \delta(t - \tau_m). \tag{2-27}$$

For the sake of definiteness, the instants $\tau_m$ are taken to constitute a Poisson point
process: the number n of such instants in an interval of duration $\Delta$ has a Poisson
distribution with expected value $\lambda\Delta$, where $\lambda$ is the average number of pulses per
unit time:

$$\Pr(n = k) = \frac{(\lambda\Delta)^k}{k!} e^{-\lambda\Delta}, \qquad k = 0, 1, 2, \ldots \tag{2-28}$$

The numbers of pulses in any disjoint intervals are furthermore statistically
independent [Hel91, p. 390], [Pap91, pp. 357-8]. The amplitudes $a_m$ are taken to have
expected value zero. The white noise is thus a kind of shot noise. By Campbell's
theorem [Hel91, p. 397], [Pap91, p. 360] the autocovariance function of this random
process is

$$E[v(t) v(s)] = \lambda E(a_m^2) \int_{-\infty}^{\infty} \delta(u - t) \delta(u - s) du = \lambda E(a_m^2) \delta(t - s). \tag{2-29}$$

White noise is modeled as in (2-27) in the limit in which the rate $\lambda$ grows beyond
all bounds and the mean-square amplitude $E(a_m^2)$ vanishes in such a way that their
product $\lambda E(a_m^2) = N/2$ remains fixed. We call this the high-rate limit.

When the random process of (2-27) passes into a linear filter whose impulse
response is $k(\tau)$, each pulse $a_m \delta(t - \tau_m)$ causes the filter to put out a copy $a_m k(t - \tau_m)$
of that response, and the net output is the shot-noise process

$$v_o(t) = \int_{-\infty}^{\infty} k(\tau) v(t - \tau) d\tau = \sum_{m=-\infty}^{\infty} a_m k(t - \tau_m). \tag{2-30}$$

The central-limit theorem assures us that for a broad class of probability distributions
of the amplitudes $a_m$, the output $v_o(t)$ of a linear filter, as defined by (2-30), will
be a Gaussian random process in the high-rate limit. From this standpoint it is
unnecessary to define the white noise itself as a Gaussian process.

The integrals for the samples $v_k$ as defined by (2-13) will have the form of a
summation like that in (2-30):

$$v_k = \sum_{\tau_m \in (0, T)} a_m f_k(\tau_m). \tag{2-31}$$

By virtue of the central-limit theorem, they too will be Gaussian random variables
in the high-rate limit, and the joint probability density function of any number of
such samples will be given by (2-26).
Let us consider the sum

$$g = \sum_{k=1}^{\infty} s_k v_k, \tag{2-32}$$

where the $v_k$'s are samples of white noise in this sense, and the $s_k$'s are the Fourier
coefficients of the signal s(t) with respect to the same set of orthonormal functions:

$$s(t) = \sum_{k=1}^{\infty} s_k f_k(t). \tag{2-33}$$

Substituting (2-31) into (2-32), we obtain

$$g = \sum_{k=1}^{\infty} \sum_{\tau_m \in (0, T)} a_m s_k f_k(\tau_m) = \sum_{\tau_m \in (0, T)} a_m s(\tau_m) = \int_0^T s(t) v(t) dt, \tag{2-34}$$

when v(t) is represented as in (2-27). If we like, we can regard (2-34) as a definition
of what is meant by an integral of the form

$$\int_0^T s(t) v(t) dt$$

when v(t) represents white noise.
2.1.6 Karhunen-Loeve Expansion 

Setting up likelihood ratios appropriate for the detection of signals in Gaussian noise
is much simpler when the samples $v_k$ are uncorrelated and hence independent. We
have seen that when the noise is white, samples defined as in (2-13) in terms of an
arbitrary set of orthonormal functions $f_k(t)$ are uncorrelated. It would be convenient
for the samples to be uncorrelated even when the noise is not white, but is described
by an arbitrary autocovariance function $\phi(t, s)$ as in (2-3). (Such noise is said to be
colored.) This will be so if the orthonormal functions utilized are the solutions of
the homogeneous integral equation

$$\lambda_k f_k(t) = \int_0^T \phi(t, s) f_k(s) ds, \qquad 0 \le t \le T, \tag{2-35}$$

whose kernel $\phi(t, s)$ is the autocovariance function of the noise. Our observation
interval remains (0, T). Nonzero solutions of this equation exist only for special
values of the constants $\lambda_k$; those values are called the eigenvalues (or "proper" or
"characteristic" values) of (2-35). The associated solutions $f_k(t)$ are called the
eigenfunctions of (2-35). (How to solve integral equations of this kind will be discussed
in Sec. 2.3.) When these eigenfunctions $f_k(t)$ are used in the series (2-12),

$$v(t) = \sum_{k=1}^{\infty} v_k f_k(t),$$

the series is called the Karhunen-Loeve expansion of the input v(t) [Loe45], [Loe46],
[Kar47].

The equation (2-35) is called a homogeneous Fredholm integral equation. For
the sake of future applications, we allow its kernel $\phi(t, s)$ to be complex, but we
impose on it the symmetry property

$$\phi(t, s) = \phi^*(s, t), \tag{2-36}$$

where the asterisk denotes the complex conjugate. When such a homogeneous
integral equation arises in detection theory, its kernel satisfies this condition. The kernel
is then said to be Hermitian; if it is also real, it is described as symmetrical.

The theory of integral equations such as (2-35) is described in a number of
books: [Lov24], [Cou31], [Hil53], and [Mor53], to name a few. The theory is akin to
that of linear operators, which act on vectors in a Hilbert space of an infinite number
of dimensions; each vector corresponds to a function defined over the interval
$0 \le t \le T$. The components of such a vector are the coefficients of the Fourier series for
the function with respect to the particular set of orthonormal functions adopted as
a basis. Thus to the function s(t) corresponds the vector $(s_1, s_2, \ldots, s_k, \ldots)$, where
the $s_k$'s are the coefficients in the Fourier series (2-33) for s(t).

Multiplication of a function of s by $\phi(t, s) ds$ and integration over $0 < s < T$
to yield a new function, now of t, constitute a particular type of linear operation.
A linear operator rotates and stretches the vectors on which it acts in such a way
that the transformed sum of two vectors is the sum of the transformed vectors, and
so on. An integral like

$$\int_0^T f^*(t) g(t) dt$$

corresponds to the scalar product (f, g) of the vectors representing the functions f(t)
and g(t).

First we shall prove that the eigenfunctions of (2-35) possess an orthonormality
property like that in (2-11). To do so, we first multiply both sides of (2-35) by $f_m^*(t) dt$
and integrate over (0, T):

$$\lambda_k \int_0^T f_m^*(t) f_k(t) dt = \int_0^T \int_0^T f_m^*(t) \phi(t, s) f_k(s) \, dt \, ds. \tag{2-37}$$

When we take the complex conjugate of (2-35) and write it for the mth eigenfunction,
it becomes

$$\lambda_m^* f_m^*(s) = \int_0^T \phi^*(s, t) f_m^*(t) dt = \int_0^T f_m^*(t) \phi(t, s) dt$$

on account of the Hermitian character (2-36) of the kernel. If we multiply both sides
of this equation by $f_k(s) ds$ and integrate, it becomes

$$\lambda_m^* \int_0^T f_m^*(s) f_k(s) ds = \int_0^T \int_0^T f_m^*(t) \phi(t, s) f_k(s) \, dt \, ds.$$

Subtracting from (2-37), we find

$$(\lambda_k - \lambda_m^*) \int_0^T f_m^*(t) f_k(t) dt = 0. \tag{2-38}$$

In most problems the eigenvalues $\lambda_k$ are all distinct. Then for $k \ne m$ the associated
eigenfunctions must be orthogonal in the sense appropriate for complex functions
on the interval (0, T):

$$\int_0^T f_m^*(t) f_k(t) dt = 0, \qquad k \ne m.$$

If two or more eigenvalues are identical, new eigenfunctions can be formed as linear
combinations of their associated eigenfunctions in such a way that they are
orthogonal; one applies the Gram-Schmidt procedure described in Sec. 2.1.4, suitably
modified if necessary to accommodate complex functions.

For k = m, on the other hand, the integral in (2-38) is positive, and $\lambda_k - \lambda_k^* = 0$;
all the eigenvalues $\lambda_k$ are therefore real. From the form of (2-35) we see that it
specifies an eigenfunction only up to an arbitrary multiplying constant, which can
be chosen so that the integral of the absolute square of the function equals 1. Then
for all indices

$$\int_0^T f_m^*(t) f_k(t) dt = \delta_{km} = \begin{cases} 1, & k = m, \\ 0, & k \ne m. \end{cases} \tag{2-39}$$

The eigenfunctions form an orthonormal set in this sense. In the vector-space
analogy, they correspond to a mutually orthogonal set of vectors of unit length.

In detection theory it suffices to assume that the kernel $\phi(t, s)$ is positive
definite, which means that for any function g(t) that is not identically zero

$$E\left[\left|\int_0^T g(t) v(t) dt\right|^2 \,\Big|\, H_0\right] = \int_0^T \int_0^T g^*(t) \phi(t, s) g(s) \, dt \, ds > 0. \tag{2-40}$$

The eigenvalues $\lambda_k$ are then all strictly positive, for from (2-35) and (2-39) we obtain

$$\lambda_k \int_0^T |f_k(t)|^2 dt = \lambda_k = \int_0^T \int_0^T f_k^*(t) \phi(t, s) f_k(s) \, dt \, ds > 0.$$

The linear operator represented by the kernel $\phi(t, s)$ changes the lengths of the
orthogonal vectors corresponding to the eigenfunctions $f_k(t)$, but it does not rotate
them. A positive definite linear operator $\phi(t, s)$ does not nullify or reverse the
direction of any of these basic orthogonal vectors.

If the kernel $\phi(t, s)$ is real and symmetrical, the eigenfunctions can also be
taken to be real. Indeed, upon taking the complex conjugate of (2-35), we find that
both $f_k(t)$ and $f_k^*(t)$ are eigenfunctions of the kernel $\phi(t, s)$ corresponding to the
same eigenvalue. The real and the imaginary parts of $f_k(t)$, as linear combinations of
these, must therefore also be eigenfunctions, and we can use them instead. Ordinarily
$f_k^*(t) \equiv f_k(t)$.

Returning now to noise with a real autocovariance function $\phi(t, s)$, we can
assume that its eigenfunctions in (2-35) are real. We can then use (2-11) and (2-16)
to show that the covariances of the samples $v_k$ are

$$\phi_{jk} = E(v_j v_k \mid H_0) = \int_0^T \int_0^T f_j(t) \phi(t, s) f_k(s) \, dt \, ds = \lambda_k \int_0^T f_j(t) f_k(t) dt = \lambda_k \delta_{jk} = \begin{cases} \lambda_k, & j = k, \\ 0, & j \ne k. \end{cases} \tag{2-41}$$

The variance of the sample $v_k$ equals the associated eigenvalue $\lambda_k$, which, as we have
seen, must be positive: $\lambda_k > 0$.
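A discretized version of the integral equation (2-35) makes these properties easy to observe. The sketch below assumes an exponential autocovariance $\phi(t, s) = \sigma^2 e^{-\mu|t - s|}$, chosen only for illustration, replaces the integral by a Riemann sum on a midpoint grid, and uses a symmetric eigensolver; the variable names and parameter values are assumptions, and the eigenvalues are approximations to those of the continuous equation.

```python
import numpy as np

sigma2, mu, T, n = 1.0, 2.0, 1.0, 300      # assumed noise parameters and grid size
dt = T / n
t = (np.arange(n) + 0.5) * dt
K = sigma2 * np.exp(-mu * np.abs(t[:, None] - t[None, :]))   # phi(t_i, s_j) on the grid

# Discretized (2-35): sum_j phi(t_i, s_j) f(s_j) dt = lambda f(t_i).
lam, U = np.linalg.eigh(K * dt)
order = np.argsort(lam)[::-1]              # largest eigenvalue first
lam, U = lam[order], U[:, order]
F = U / np.sqrt(dt)                        # columns f_k with sum(f_k**2) * dt = 1

# Covariances of the samples v_k, the discrete counterpart of (2-41).
cov_samples = F.T @ K @ F * dt ** 2
```

The matrix `cov_samples` is diagonal with the eigenvalues on its diagonal, all eigenvalues are positive, and their sum equals $\sigma^2 T$, the average total energy of this stationary noise in (0, T).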

If we write a Fourier expansion of the type in (2-12) for $\phi(t, s)$ considered as
a function of t, its kth coefficient $\phi_k$ must, by (2-13), equal

$$\phi_k = \int_0^T f_k(u) \phi(u, s) du = \lambda_k f_k(s)$$

by (2-36) and (2-35), and the Fourier expansion of $\phi(t, s)$ becomes

$$\phi(t, s) = \sum_{k=1}^{\infty} \lambda_k f_k(t) f_k(s). \tag{2-42}$$

It is known as Mercer's formula.
2.1.7 Reproducing-Kernel Hilbert Space$^2$

In Sec. 2.1.3 we found that with white noise there is an unlimited number of sets of 
orthonormal functions f k {t) with which one can sample the input v(t) and obtain 
samples that are uncorrelated and, being Gaussian random variables, independent.
In Sec. 2.1.6, on the other hand, it appears as though when the noise is colored, 
only a single set would serve that purpose, specifically, the set of eigenfunctions of 
the integral equation (2-35). In order to find an analogous plenitude of sampling 
functions that will produce uncorrelated samples, the definition of orthogonality 
must be modified. It is now based on a new definition of the scalar product of two 
functions, and to distinguish the new scalar product from that in (2-20), we mark it 
with angular brackets. 

For functions h(t) and m(t) defined in the interval (0, T), this scalar product
is defined by

$$\langle h, m \rangle = \langle m, h \rangle = \int_0^T H(t) m(t) dt = \int_0^T h(t) M(t) dt, \tag{2-43}$$

where H(t) and M(t) are solutions of the integral equations

$$h(t) = \int_0^T \phi(t, s) H(s) ds, \qquad m(t) = \int_0^T \phi(t, s) M(s) ds, \qquad 0 \le t \le T. \tag{2-44}$$

As before, $\phi(t, s)$ is the autocovariance function of the noise, and we are here
assuming it to be real and as in (2-36) symmetrical: $\phi(t, s) = \phi(s, t)$. We call H(t)
and M(t) the cofunctions of h(t) and m(t), respectively.

The norm of a function is its scalar product with itself:

$$\|h(t)\| = \langle h, h \rangle = \int_0^T h(t) H(t) dt = \int_0^T \int_0^T H(t) \phi(t, s) H(s) \, ds \, dt. \tag{2-45}$$

Because $\phi(t, s)$ is positive definite (see (2-40)), the norm is always positive unless
$h(t) \equiv 0$. The theory deals only with functions of finite norm in this sense. The
square root of the norm is analogous to the length of a vector, and the functionals
$\langle h, m \rangle$ have properties analogous to scalar products in an ordinary Cartesian vector
space.

The functions h(t) and m(t) are now termed orthogonal if $\langle h, m \rangle = 0$. A set of
functions $\{f_k(t)\}$ is said to be orthonormal if

$$\langle f_j, f_k \rangle = \delta_{jk} = \begin{cases} 1, & j = k, \\ 0, & j \ne k. \end{cases} \tag{2-46}$$

The eigenfunctions of the integral equation (2-35) are orthogonal in this new sense,
and if after normalization as in (2-39) they are multiplied by $\lambda_k^{1/2}$, they acquire norms
$\|f_k(t)\|$ equal to 1.

2 Reading this part can be deferred until Sec. 2.2.4.

From a set of linearly independent functions $g_k(t)$ of finite norm, a set of
functions $f_k(t)$ orthonormal as in (2-46) can be constructed by the Gram-Schmidt
procedure. It is merely necessary to replace the scalar product $(\cdot, \cdot)$ used in Sec. 2.1.4
by the scalar product $\langle \cdot, \cdot \rangle$ introduced here. We define samples of the input v(t) by

$$v'_k = \langle v, f_k \rangle = \int_0^T F_k(t) v(t) dt, \qquad \forall k, \tag{2-47}$$

where $F_k(t)$ is the cofunction to $f_k(t)$. These samples are uncorrelated, for their
covariances are

$$\operatorname{Cov}(v'_k, v'_m) = E(v'_k v'_m \mid H_0) = \int_0^T \int_0^T F_k(t) F_m(s) E[v(t) v(s) \mid H_0] \, dt \, ds = \int_0^T \int_0^T F_k(t) \phi(t, s) F_m(s) \, dt \, ds = \int_0^T F_k(t) f_m(t) dt = \langle f_k, f_m \rangle = \delta_{km},$$

and the samples $v'_k$, being Gaussian random variables, are statistically independent.

The autocovariance function $\phi(t, u)$ of the noise is called the kernel of the space
of functions having finite norm in the sense of (2-45). Considered as a function only
of t, with u a parameter, it has the so-called reproducing property

$$\langle h(t), \phi(t, u) \rangle = h(u), \qquad 0 < u < T, \tag{2-48}$$

for any function in the space. To demonstrate this, we again denote the cofunction
of h(t) by H(t), whereupon the scalar product in (2-48) is

$$\int_0^T \phi(u, t) H(t) dt = h(u)$$

by the symmetry of $\phi(t, u)$ and by (2-44). Because of (2-48), a function space with
a scalar product defined as in (2-43) is called a reproducing kernel Hilbert space.

As we shall see in Sec. 2.3, although cofunctions such as H(t) and M(t) in
(2-44) often contain delta functions and derivatives of delta functions, the scalar
product $\langle h, m \rangle$ of h(t) and m(t) can, for a large class of autocovariance functions,
be written in a form that is free of such "pathological" entities. If one has by some
means worked out the form of the scalar product $\langle h, m \rangle$, one needs only to verify
that it possesses the reproducing property (2-48). The detection of signals in colored
Gaussian noise has been analyzed in the framework of the reproducing kernel Hilbert
space by Kailath and others [Kai67], [Kai71]. In particular, the latter exhibits the
norms arising from a variety of autocovariance functions, and from these the form
of the scalar product $\langle h, m \rangle$ can be directly derived by the rule

$$\|h(t) + m(t)\| = \|h(t)\| + \|m(t)\| + 2\langle h, m \rangle.$$
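On a time grid the cofunctions become solutions of a linear system, which allows the scalar product (2-43), the reproducing property (2-48), and the rule relating norms to scalar products to be checked numerically. The sketch below is an illustration under assumptions: the exponential autocovariance, the grid, the test functions h(t) and m(t), and the helper names are all chosen for the example, and "norm" follows the usage of (2-45), that is, the scalar product of a function with itself.

```python
import numpy as np

sigma2, mu, T, n = 1.0, 2.0, 1.0, 100      # assumed kernel parameters and grid size
dt = T / n
t = (np.arange(n) + 0.5) * dt
K = sigma2 * np.exp(-mu * np.abs(t[:, None] - t[None, :]))   # phi(t_i, s_j)

def cofunction(h):
    """Solve the discretized (2-44), h = K H dt, for the cofunction H."""
    return np.linalg.solve(K * dt, h)

def rkhs_inner(h, m):
    """Discrete stand-in for the scalar product <h, m> of (2-43)."""
    return float(np.sum(cofunction(h) * m) * dt)

h = np.sin(np.pi * t / T)                  # assumed test functions
m = t * (T - t)

j = 37                                     # grid index of the parameter u
reproduced = rkhs_inner(h, K[:, j])        # <h(t), phi(t, u_j)>, which should equal h(u_j)

norm_of_sum = rkhs_inner(h + m, h + m)
norm_rule = rkhs_inner(h, h) + rkhs_inner(m, m) + 2.0 * rkhs_inner(h, m)
```

Plotting `cofunction(h)` shows large values at the two ends of the interval, the discrete trace of the delta functions that cofunctions often contain, whereas the scalar products themselves remain well behaved.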
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2.2 DETECTION IN GAUSSIAN NOISE 



2.2.1 The Likelihood Ratio 

The input v(t) is observed during an interval (0, T). We seek the optimum strategy
for deciding between the hypotheses

$$H_0: \quad v(t) = n(t), \qquad 0 \le t \le T,$$

and

$$H_1: \quad v(t) = s(t) + n(t), \qquad 0 \le t \le T,$$

where n(t) is Gaussian noise with autocovariance function

$$E[n(t) n(s)] = E[v(t) v(s) \mid H_0] = \phi(t, s),$$

and s(t) is a signal of known form and amplitude. For simplicity we assume that
s(t) vanishes outside the interval (0, T). If the noise is stationary, its autocovariance
function has the form $\phi(t, s) = \phi(t - s)$, and it possesses a spectral density $\Phi(\omega)$
as in (2-7). Such noise is called colored to distinguish it from white noise, whose
spectral density is uniform and whose autocovariance function is proportional to a
Dirac delta function as in (2-24). Even if the noise is nonstationary and possesses
no spectral density, we refer to it as colored noise.

The input v(t) is sampled in terms of a set of functions $f_k(t)$ orthonormal over
(0, T), as described in Sec. 2.1.6, with the samples defined by

$$v_k = \int_0^T f_k(t) v(t) dt, \tag{2-49}$$

in which the functions $f_k(t)$ are the eigenfunctions of the integral equation (2-35),
whose kernel is the autocovariance function $\phi(t, s)$ of the noise. We are expressing
the input v(t) in a Karhunen-Loeve expansion

$$v(t) = \sum_{k=1}^{\infty} v_k f_k(t). \tag{2-50}$$

For the moment we base our decision on only the first n of these samples, $v_1, v_2, \ldots, v_n$.
Later we shall let n grow beyond all bounds.

We have seen that these samples are statistically independent Gaussian random
variables. As in (2-41) their variances are

$$\operatorname{Var} v_k = \lambda_k, \tag{2-51}$$

where $\lambda_k$ is the $k$th eigenvalue of the integral equation (2-35). Under hypothesis
$H_0$, as in (2-15), their expected values are zero. Under hypothesis $H_1$, on the other
hand, when the signal $s(t)$ is present, $E[v(t) \mid H_1] = s(t)$, and

$$E(v_k \mid H_1) = \int_0^T f_k(t)\,s(t)\,dt = s_k, \tag{2-52}$$

where $s_k$ is the $k$th coefficient of a Fourier expansion of the signal $s(t)$ in terms of
the orthonormal eigenfunctions $f_k(t)$ of (2-35):

$$s(t) = \sum_{k=1}^{\infty} s_k f_k(t). \tag{2-53}$$

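The joint probability density function of the samples $v_k$ under hypothesis $H_0$
is thus

$$p_0(v_1, v_2, \ldots, v_n) = p_0(v) = \prod_{k=1}^{n} \frac{1}{\sqrt{2\pi\lambda_k}} \exp\left(-\frac{v_k^2}{2\lambda_k}\right).$$

Under hypothesis $H_1$ the noise component of the $k$th sample is $v_k - s_k$, and the
joint density function of the samples $v_1, v_2, \ldots, v_n$ under hypothesis $H_1$ is

$$p_1(v_1, v_2, \ldots, v_n) = p_1(v) = \prod_{k=1}^{n} \frac{1}{\sqrt{2\pi\lambda_k}} \exp\left[-\frac{(v_k - s_k)^2}{2\lambda_k}\right].$$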

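The optimum strategy, as we learned in Chapter 1, bases the decision on the
likelihood ratio

$$\Lambda_n(v) = \frac{p_1(v)}{p_0(v)} = \prod_{k=1}^{n} \exp\left[\frac{v_k^2}{2\lambda_k} - \frac{(v_k - s_k)^2}{2\lambda_k}\right] = \exp \sum_{k=1}^{n} \left(\frac{s_k v_k}{\lambda_k} - \frac{s_k^2}{2\lambda_k}\right).$$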

The data $v$ appear in this expression only combined as in

$$g_n = \sum_{k=1}^{n} \frac{s_k v_k}{\lambda_k}, \tag{2-54}$$

and the likelihood ratio $\Lambda_n(v)$ is a monotone increasing function of $g_n$. Hence,
according to Sec. 1.2.4, the decision can just as well be based on the sufficient
statistic $g_n$, which will be compared with some decision level $g_{n0}$; if $g_n > g_{n0}$, $H_1$
is chosen, otherwise $H_0$. Let us evaluate the false-alarm and detection probabilities
characterizing this strategy.

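Because $g_n$, defined by (2-54), is a linear combination of the Gaussian random
variables $v_1, v_2, \ldots, v_n$, it too has a Gaussian distribution under both hypotheses.
Under hypothesis $H_0$ the expected value of $g_n$ is zero because $E(v_k \mid H_0) = 0$. Its
variance is

$$\sigma_n^2 = \operatorname{Var}(g_n \mid H_0) = \sum_{k=1}^{n} \left(\frac{s_k}{\lambda_k}\right)^2 \operatorname{Var}(v_k \mid H_0) = \sum_{k=1}^{n} \frac{s_k^2}{\lambda_k} \tag{2-55}$$

by (2-51). Its probability density function is therefore

$$p_0(g_n) = (2\pi\sigma_n^2)^{-1/2} \exp\left(-\frac{g_n^2}{2\sigma_n^2}\right), \tag{2-56}$$

and the probability $Q_0$ of a false alarm, that is, of $g_n$ exceeding the decision level
$g_{n0}$ when no signal is present, is

$$Q_0 = (2\pi\sigma_n^2)^{-1/2} \int_{g_{n0}}^{\infty} \exp\left(-\frac{g_n^2}{2\sigma_n^2}\right) dg_n = \operatorname{erfc}\left(\frac{g_{n0}}{\sigma_n}\right) \tag{2-57}$$

by (1-32), where erfc(·) is the error-function integral defined in (1-11).

Under hypothesis $H_1$ the expected value of the statistic $g_n$ is

$$E(g_n \mid H_1) = \sum_{k=1}^{n} \frac{s_k}{\lambda_k}\,E(v_k \mid H_1) = \sum_{k=1}^{n} \frac{s_k^2}{\lambda_k} = \sigma_n^2$$

by (2-52). Its variance is the same as under hypothesis $H_0$ because $g_n$ in (2-54) is a
linear combination of the variables $v_k$, and the terms due to the signal will cancel
out when $\operatorname{Var}(g_n \mid H_1)$ is calculated. Hence the probability density function of the
statistic $g_n$ under hypothesis $H_1$ is

$$p_1(g_n) = (2\pi\sigma_n^2)^{-1/2} \exp\left[-\frac{(g_n - \sigma_n^2)^2}{2\sigma_n^2}\right], \tag{2-58}$$

and the probability of detection is

$$Q_d = \operatorname{erfc}\left(\frac{g_{n0} - \sigma_n^2}{\sigma_n}\right) = \operatorname{erfc}\left(\frac{g_{n0}}{\sigma_n} - \sigma_n\right) \tag{2-59}$$

by (1-33). Given the false-alarm probability $Q_0$, we solve

$$Q_0 = \operatorname{erfc} x \tag{2-60}$$

for $x$, whereupon the probability of detection is

$$Q_d = \operatorname{erfc}(x - \sigma_n), \tag{2-61}$$

where $\sigma_n$ is given by (2-55).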

The more samples $v_1, v_2, \ldots, v_n$ of the input $v(t)$ we utilize, the larger the
probability $Q_d$ of detection, for $\sigma_n^2$ as defined by (2-55) increases with $n$. Hence, as
we might expect, the maximum probability of detection is attained by utilizing all
the samples, letting $n$ go to infinity, and basing the decision on the statistic

$$g = \lim_{n \to \infty} g_n = \sum_{k=1}^{\infty} \frac{s_k v_k}{\lambda_k}. \tag{2-62}$$

Similarly passing to the limit $n \to \infty$ in (2-55), we find that

$$\lim_{n \to \infty} \sigma_n^2 = \operatorname{Var} g = d^2 = \sum_{k=1}^{\infty} \frac{s_k^2}{\lambda_k}. \tag{2-63}$$

The random variable $g$ defined by (2-62) makes sense only if its variance $d^2$ is finite.
The quantity $d^2$ is the basic signal-to-noise ratio in this detection problem.
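
The dependence of the detection probability on the number of samples can be illustrated numerically. The following is only a sketch: the signal coefficients and eigenvalues are hypothetical values chosen for illustration, and the function names are ours. It takes erfc(·) in the sense used in this chapter, the area under the unit normal density to the right of its argument, and obtains it from the standard complementary error function.

```python
import math

def gauss_tail(x):
    """erfc(x) in this chapter's sense: unit-normal probability of exceeding x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def gauss_tail_inv(p, lo=-40.0, hi=40.0):
    """Solve gauss_tail(x) = p for x by bisection (gauss_tail is decreasing)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gauss_tail(mid) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def detection_probability(s, lam, q0):
    """Q_d from (2-60) and (2-61), using the samples whose coefficients are given."""
    sigma_n = math.sqrt(sum(sk * sk / lk for sk, lk in zip(s, lam)))  # (2-55)
    x = gauss_tail_inv(q0)                                            # (2-60)
    return gauss_tail(x - sigma_n)                                    # (2-61)

# Hypothetical signal coefficients s_k and noise eigenvalues lambda_k.
s_coeffs = [1.0, 0.8, 0.5, 0.3, 0.2]
eigenvalues = [1.0, 0.5, 0.25, 0.125, 0.0625]

qd_by_n = [detection_probability(s_coeffs[:n], eigenvalues[:n], 1e-3)
           for n in range(1, len(s_coeffs) + 1)]
for n, qd in enumerate(qd_by_n, start=1):
    print(n, round(qd, 4))
```

Each additional sample adds a nonnegative term to the sum in (2-55), so the printed detection probabilities cannot decrease as n grows.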

2.2.2 The Sufficient Statistic 

Just as the function $s(t)$ corresponds to the set of signal samples $\{s_k\}$ through (2-53),
in which the $f_k(t)$ are the eigenfunctions in (2-35), so we can define a function $q(t)$
by the Fourier series

$$q(t) = \sum_{k=1}^{\infty} q_k f_k(t), \qquad q_k = \frac{s_k}{\lambda_k}. \tag{2-64}$$

Then by virtue of the orthonormality of the eigenfunctions, we can write our sufficient
statistic as

$$g = \sum_{k=1}^{\infty} q_k v_k = \sum_{k=1}^{\infty} q_k \int_0^T f_k(t)\,v(t)\,dt = \int_0^T q(t)\,v(t)\,dt, \tag{2-65}$$

by (2-13). Furthermore, by (2-35),

$$\int_0^T \phi(t, u)\,q(u)\,du = \sum_{k=1}^{\infty} q_k \int_0^T \phi(t, u)\,f_k(u)\,du = \sum_{k=1}^{\infty} q_k \lambda_k f_k(t) = \sum_{k=1}^{\infty} s_k f_k(t) = s(t),$$

whence $q(t)$ is the solution of the inhomogeneous integral equation

$$s(t) = \int_0^T \phi(t, u)\,q(u)\,du, \qquad 0 < t < T. \tag{2-66}$$

In the next section we shall consider methods of solving this fundamental integral
equation.

The statistic $g$ can be generated by multiplying the input $v(t)$ by $q(t)$ and
integrating over the observation interval $(0, T)$. One speaks of correlating the input
with $q(t)$, and a receiver that does so has been called a correlation receiver. We
shall see in Sec. 2.2.3, however, that it is simpler to produce the statistic $g$ by linear
filtering of $v(t)$.

By replacing $v_k$ by $s_k$ in (2-65) and comparing with (2-63), we see that we can
write the signal-to-noise ratio $d^2$ as

$$d^2 = \int_0^T s(t)\,q(t)\,dt. \tag{2-67}$$

Under hypothesis $H_0$ the expected value of our statistic $g$ equals 0; under hypothesis
$H_1$ it is

$$E(g \mid H_1) = \int_0^T q(t)\,E[v(t) \mid H_1]\,dt = \int_0^T q(t)\,s(t)\,dt = d^2.$$

As $g$ is a Gaussian random variable and its variance equals $d^2$ under both hypotheses
(2-63), its probability density functions under the two hypotheses are

$$p_0(g) = (2\pi d^2)^{-1/2} \exp\left(-\frac{g^2}{2d^2}\right), \tag{2-68}$$

$$p_1(g) = (2\pi d^2)^{-1/2} \exp\left[-\frac{(g - d^2)^2}{2d^2}\right]. \tag{2-69}$$

The likelihood ratio of our statistic $g$ is therefore

$$\Lambda(g) = \frac{p_1(g)}{p_0(g)} = \exp\left(g - \tfrac{1}{2}d^2\right). \tag{2-70}$$

By (2-65) and (2-67) this can be written in terms of the input $v(t)$ as

$$\Lambda[v(t)] = \exp\left[\int_0^T q(t)\,v(t)\,dt - \frac{1}{2}\int_0^T s(t)\,q(t)\,dt\right]. \tag{2-71}$$



We call this the likelihood functional for detection of the signal $s(t)$ in Gaussian noise
having autocovariance function $\phi(t, s)$.

Under the Bayes criterion, the likelihood functional is compared with the quantity
$\Lambda_0$ given by (1-17) in terms of the prior probabilities $\zeta_0$ and $\zeta_1$ of hypotheses $H_0$
and $H_1$ and the costs $C_{ij}$ attending the various combinations of decision and true
hypothesis. Equivalently, hypothesis $H_1$ is chosen whenever

$$g > g_0 = \tfrac{1}{2}d^2 + \ln \Lambda_0.$$

By passing to the limit $n \to \infty$ in (2-57) and (2-59) or by putting (2-68) and (2-69)
into (1-32) and (1-33), we find that the false-alarm and detection probabilities have
the forms

$$Q_0 = \operatorname{erfc} x, \qquad x = \frac{g_0}{d}, \qquad Q_d = \operatorname{erfc}(x - d), \tag{2-72}$$

where $d^2$ is the signal-to-noise ratio defined in (2-67).

When the Neyman-Pearson criterion is used, the value of the false-alarm probability
$Q_0$ is fixed in advance, usually on the basis of the relative frequency of errors
of the first kind that the observer can tolerate. The decision level $g_0$ is then
determined from (2-72) as $g_0 = dx$.

In Fig. 2-1 the probability of detection $Q_d$ has been plotted against $d$ for a
number of values of the false-alarm probability $Q_0$. The pair of equations (2-72)
represents in parametric form the operating characteristic of the statistical test or
the detection strategy; a number of these are shown in Fig. 2-2 for various values of
the signal-to-noise parameter $d$.

It is often convenient to describe the effectiveness of a receiver by quoting the
signal-to-noise ratio of the minimum detectable signal, that is, the value of $d^2$ required
to attain a certain probability $Q_d$ of detection for a given false-alarm probability $Q_0$.
If, for instance, the values adopted are $Q_0 = 10^{-6}$ and $Q_d = 0.99$, we obtain from
(2-72) the ratio $d^2 = 50.12$, or 17.00 dB.
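
The minimum detectable signal-to-noise ratio can be checked with a few lines of code. This is a sketch under the assumption that erfc(·) in (2-72) denotes the right-hand tail area of the unit normal density; the function names are ours, and the inverse is found by simple bisection rather than by any library routine.

```python
import math

def gauss_tail(x):
    """erfc(x) in this chapter's sense: unit-normal probability of exceeding x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def gauss_tail_inv(p, lo=-40.0, hi=40.0):
    """Solve gauss_tail(x) = p for x by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gauss_tail(mid) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def min_detectable_snr(q0, qd):
    """d^2 from (2-72): Q0 = erfc(x) and Qd = erfc(x - d)."""
    x = gauss_tail_inv(q0)
    d = x - gauss_tail_inv(qd)
    return d * d

d2 = min_detectable_snr(1e-6, 0.99)
d2_db = 10.0 * math.log10(d2)
print(round(d2, 2), round(d2_db, 2))
```

For the values quoted in the text the computed ratio agrees with $d^2 \approx 50.12$; other pairs $(Q_0, Q_d)$ can be substituted in the same call.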

When the noise is white with unilateral spectral density $N$, its autocovariance
function is

$$\phi(t, s) = \frac{N}{2}\,\delta(t - s)$$

as in (2-24), and substitution into (2-66) shows that

$$q(t) = \frac{2}{N}\,s(t). \tag{2-73}$$

The likelihood functional in (2-71) becomes

$$\Lambda[v(t)] = \exp\left[\frac{2}{N}\int_0^T s(t)\,v(t)\,dt - \frac{1}{N}\int_0^T [s(t)]^2\,dt\right]. \tag{2-74}$$

The detection statistic can now be taken as

$$g = \int_0^T s(t)\,v(t)\,dt,$$

the factor of $(2/N)$ being absorbed into the decision level $g_0$. The signal-to-noise
ratio becomes

$$d^2 = \frac{2E}{N} \tag{2-75}$$

in terms of the "energy"

$$E = \int_0^T [s(t)]^2\,dt \tag{2-76}$$

of the signal. In Sec. 2.4 we shall present a physical interpretation of this basic
signal-to-noise ratio.

Figure 2-1. Probability of detection: completely known signal in Gaussian noise.
Curves are indexed by the false-alarm probability $Q_0$.

That the statistic $g$ defined in (2-65) is optimum for deciding between the
hypotheses $H_0$ and $H_1$ was demonstrated by Grenander in 1950 [Gre50], but under
the assumption that the solution $q(t)$ of (2-66) is square integrable over $(0, T)$.
We shall see in Sec. 2.3 that that solution often contains delta functions and their
derivatives, which are not square integrable. Kadota [Kad67] has shown that $g$ is
optimum even under those conditions, provided that the signal-to-noise ratio $d^2$ in
(2-67) is finite.

Figure 2-2. Receiver operating characteristics: completely known signal in Gaussian
noise. Curves are indexed by signal-to-noise ratio $d$.



When, as ordinarily, a portion of the noise $n(t)$ is white and the energy of
the signal is finite, the signal-to-noise ratio $d^2$ in (2-67) will be finite, and errors
have nonzero probabilities under both hypotheses. If the white component has been
neglected in modeling the noise in a detection problem, it is possible for the
signal-to-noise ratio $d^2$ to turn out to be infinite. Examination of (2-56) and (2-58) shows
that when we make the number $n$ of samples larger and larger, the probability density
functions of the statistic $g_n$ move farther and farther apart; and if $d^2 = \infty$, we can,
by setting a decision level midway between them, cause the probabilities $Q_0$ and
$Q_1 = 1 - Q_d$ of errors of the first and second kinds to go to zero. A situation like
this, which is of hardly more than mathematical interest, results in what is known as
singular detection. Conditions under which it may occur have been extensively studied
[Roo63], [Yag63], [Kai66a]. A rigorous treatment of this matter can be found in the
book by Poor [Poo88]. For those who wish to delve into the literature on this topic,
we mention that in its terminology the "equivalence" of the probability measures
associated with $v(t)$ under the two hypotheses means that detection is imperfect;
"perpendicularity" of the measures means that error probabilities are zero and the
singular case is at hand. What we have called the likelihood functional is often
termed the Radon-Nikodym derivative of one measure with respect to the other.

An indication of the kind of noise model that may imply singular detection
can be obtained by studying the detection of a signal $s(t)$ in stationary Gaussian
noise on the basis of observation of the input $v(t)$ during an infinite interval. The
integral equation (2-66) then becomes

$$s(t) = \int_{-\infty}^{\infty} \phi(t - u)\,q_\infty(u)\,du. \tag{2-77}$$

Because it has the form of a convolution, it can be solved by Fourier transformation,

$$q_\infty(t) = \int_{-\infty}^{\infty} Q_\infty(\omega)\,e^{i\omega t}\,\frac{d\omega}{2\pi}, \tag{2-78}$$

$$Q_\infty(\omega) = \frac{S(\omega)}{\Phi(\omega)}, \tag{2-79}$$

in terms of the spectral density $\Phi(\omega)$ of the noise and the spectrum $S(\omega)$ of the
signal. In this limit of an infinitely long observation interval, the signal-to-noise
ratio (2-67) becomes

$$d_\infty^2 = \int_{-\infty}^{\infty} s(t)\,q_\infty(t)\,dt = \int_{-\infty}^{\infty} S^*(\omega)\,Q_\infty(\omega)\,\frac{d\omega}{2\pi} = \int_{-\infty}^{\infty} \frac{|S(\omega)|^2}{\Phi(\omega)}\,\frac{d\omega}{2\pi}. \tag{2-80}$$

The longer the observation interval is, the more information in the input $v(t)$
is utilized for deciding between the two hypotheses, and the higher the probability
$Q_d$ of detection for a given false-alarm probability $Q_0$ must be. The effective
signal-to-noise ratio $d^2$ must therefore increase with the length of the interval $(0, T)$, and
the quantity $d_\infty^2$ of (2-80) must represent an upper bound to the signal-to-noise ratio
for any finite interval,

$$d^2 \le d_\infty^2.$$

If $d_\infty^2$ is finite, $d^2$ must be finite, and detection will not be singular. This will be
the case when the spectral density $\Phi(\omega)$ of the noise is nonzero for all $\omega$ and drops
off to zero as $|\omega| \to \infty$ more slowly than $|S(\omega)|^2$ and, in particular, when the noise
contains a white component whose spectral density is uniform at all frequencies. If,
on the other hand, $d_\infty^2 = \infty$, as would happen, for instance, if the spectral density
$\Phi(\omega)$ of the noise vanished in any frequency band where the spectrum $S(\omega)$ of the
signal did not, detection might be singular even for a finite observation interval.

2.2.3 The Matched Filter 

The sufficient statistic $g$ in (2-65) can be generated by passing the input $v(t)$ through
a filter matched to the "signal" $q(t)$. The impulse response of this filter is

$$k(\tau) = \begin{cases} q(T - \tau), & 0 < \tau < T, \\ 0, & \tau < 0. \end{cases} \tag{2-81}$$

It is customary, although unnecessary, to take $k(\tau) = 0$ for $\tau > T$ as well. When
the input $v(t)$ is turned on at time $t = 0$, the output of this filter at a later time $t$ in
$0 < t < T$ is

$$v_0(t) = \int_0^t k(\tau)\,v(t - \tau)\,d\tau = \int_0^t q(T - \tau)\,v(t - \tau)\,d\tau = \int_{T-t}^{T} q(u)\,v(t - T + u)\,du,$$

and at time $T$ the output equals

$$v_0(T) = \int_0^T q(u)\,v(u)\,du = g.$$

The optimum receiver therefore consists of the matched filter $k(\cdot)$ and a sampling
device that samples the output $v_0(t)$ of the filter at the end of the observation interval
$(0, T)$ and compares it with a decision level $g_0$; if $g$ exceeds $g_0$, the receiver decides
that a signal is present.
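
The equality of the sampled filter output and the correlation statistic is easy to see in a discrete-time analogue. The sketch below is illustrative only: the sample values of $q(t)$ and $v(t)$ are hypothetical, the integrals are replaced by plain sums, and the function name is ours.

```python
def matched_filter_output(q, v):
    """Discrete analogue of v_0(t): convolve v with k, where k is q reversed in time."""
    n = len(q)
    k = q[::-1]  # k[i] = q[n - 1 - i], the counterpart of k(tau) = q(T - tau)
    # The input is turned on at index 0, so only past samples v[t - i] contribute.
    return [sum(k[i] * v[t - i] for i in range(t + 1)) for t in range(n)]

q = [0.5, -1.0, 2.0, 0.25]   # hypothetical samples of q(t)
v = [1.0, 0.3, -0.7, 2.0]    # hypothetical samples of the input v(t)

v0 = matched_filter_output(q, v)
g = sum(qi * vi for qi, vi in zip(q, v))  # correlation statistic, analogue of (2-65)
print(v0[-1], g)
```

Only the last output sample, the one taken at the end of the observation interval, coincides with the correlation statistic; earlier output samples are partial sums over shifted products.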

Among all linear filters through which the input $v(t)$ might be passed, the
matched filter yields the largest output signal-to-noise ratio

$$d_{\text{out}}^2 = \frac{[s_0(T)]^2}{\operatorname{Var} v_n(T)}. \tag{2-82}$$

Here $s_0(t)$ and $v_n(t)$ are the outputs of an arbitrary linear filter when the signal and
the noise are, respectively, alone present in the input, which we now suppose turned
on at time $t = 0$. (When the noise is nonwhite, that is, correlated, its values for $t < 0$
would provide information enhancing signal detectability even though the signal
started after $t = 0$.) With $k(\tau)$ the impulse response of the filter, these components
of the output are at time $T$

$$s_0(T) = \int_0^T k(\tau)\,s(T - \tau)\,d\tau = \int_0^T k(T - u)\,s(u)\,du,$$

$$v_n(T) = \int_0^T k(T - u)\,n(u)\,du.$$

Because the noise output $v_n(T)$ has zero expected value, its variance is

$$\operatorname{Var} v_n(T) = E\{[v_n(T)]^2 \mid H_0\} = \int_0^T\!\!\int_0^T k(T - u_1)\,k(T - u_2)\,E[n(u_1)n(u_2)]\,du_1\,du_2$$

$$= \int_0^T\!\!\int_0^T k(T - u_1)\,\phi(u_1, u_2)\,k(T - u_2)\,du_1\,du_2,$$

where $\phi(\cdot\,, \cdot)$ is still the autocovariance function of the noise.

Let us write both $s(t)$ and $k(T - t)$ as Fourier expansions in terms of the
eigenfunctions $f_k(t)$ of (2-35):

$$s(t) = \sum_{m=1}^{\infty} s_m f_m(t), \qquad k(T - t) = \sum_{m=1}^{\infty} k_m f_m(t).$$

Then when we use (2-35) and (2-39), we can write the output signal-to-noise ratio
as

$$d_{\text{out}}^2 = \frac{\left[\sum_{m=1}^{\infty} k_m s_m\right]^2}{\sum_{m=1}^{\infty} \lambda_m k_m^2}.$$

The Cauchy-Schwarz inequality for sequences states that for any two sequences
$(a_1, a_2, \ldots)$ and $(b_1, b_2, \ldots)$

$$\left[\sum_{m=1}^{\infty} a_m b_m\right]^2 \le \sum_{m=1}^{\infty} a_m^2 \sum_{m=1}^{\infty} b_m^2, \tag{2-83}$$

with equality if and only if $b_m = c\,a_m$, $\forall m$, for some constant $c$. Taking $a_m = \lambda_m^{1/2} k_m$,
$b_m = \lambda_m^{-1/2} s_m$, we find

$$d_{\text{out}}^2 \le \sum_{m=1}^{\infty} \frac{s_m^2}{\lambda_m} = d^2,$$

with equality if and only if $k_m = c\,s_m/\lambda_m = c\,q_m$; that is,

$$k(T - t) = c\,q(t).$$

The maximum possible output signal-to-noise ratio is therefore given by $d^2$ of (2-67)
and is attained by the matched filter specified by (2-81). The concept of the
matched filter was proposed by North [Nor43] and applied to this decision problem
by Peterson, Birdsall, and Fox [Pet54].
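
The optimality of the matched filter can be checked numerically in the eigenfunction coordinates. The sketch below assumes a finite number of hypothetical coefficients $s_m$ and eigenvalues $\lambda_m$, and compares the output signal-to-noise ratio of the matched choice $k_m = s_m/\lambda_m$ with that of randomly drawn filter coefficients; the function name is ours.

```python
import random

def output_snr(k, s, lam):
    """d_out^2 = (sum k_m s_m)^2 / sum lambda_m k_m^2 for filter coefficients k_m."""
    numerator = sum(km * sm for km, sm in zip(k, s)) ** 2
    denominator = sum(lm * km * km for km, lm in zip(k, lam))
    return numerator / denominator

# Hypothetical signal coefficients and noise eigenvalues.
s = [1.0, 0.8, 0.5, 0.3]
lam = [1.0, 0.5, 0.25, 0.125]

d2 = sum(sm * sm / lm for sm, lm in zip(s, lam))   # bound from Cauchy-Schwarz
q = [sm / lm for sm, lm in zip(s, lam)]            # matched coefficients q_m
matched = output_snr(q, s, lam)

rng = random.Random(0)                             # fixed seed for repeatability
trials = [output_snr([rng.uniform(-1.0, 1.0) for _ in s], s, lam)
          for _ in range(1000)]
best_random = max(trials)
print(matched, d2, best_random)
```

The matched coefficients attain the bound exactly, and no random trial exceeds it; scaling all the $k_m$ by a common constant leaves the ratio unchanged, which is why the constant $c$ is immaterial.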



2.2.3.1. The matched filter for white noise. When the noise is white, the
function $q(t)$ is proportional to the signal $s(t)$, as in (2-73). The constant of
proportionality can be absorbed in the decision level $g_0$, and the impulse response of
the matched filter for detection of this signal in white noise can be taken as

$$k(\tau) = \begin{cases} s(T - \tau), & 0 < \tau < T, \\ 0, & \tau < 0,\ \tau > T. \end{cases} \tag{2-84}$$

Example 2-1 Triangular pulse in white noise

Let the signal be a triangular pulse of the form

$$s(t) = \begin{cases} t, & 0 < t < T', \\ 0, & t < 0,\ t > T'. \end{cases}$$

It is illustrated in Fig. 2-3. The observation interval is still $(0, T)$, with $T > T'$, and
the impulse response of the matched filter is

$$k(\tau) = \begin{cases} T - \tau, & T - T' < \tau < T, \\ 0, & \tau < T - T',\ \tau > T, \end{cases}$$

as shown in the same figure. Then the output of the filter when only the signal $s(t)$ is
applied to it is

$$s_0(t) = \begin{cases} 0, & t < T - T', \\ \tfrac{1}{6}\,(|T - t| + 2T')(T' - |T - t|)^2, & T - T' < t < T + T', \\ 0, & t > T + T'. \end{cases}$$

It is shown in the lowest part of Fig. 2-3. This output reaches a maximum at $t = T$,
the time at which the decision is made. For a pulse of duration limited to $(0, T')$, there
is no point to making the observation interval longer than $T'$.
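
The closed-form output for the triangular pulse can be compared with a direct numerical evaluation of the filtering integral. In this sketch the values $T = 2$ and $T' = 1$ are arbitrary illustrative choices, the integral is approximated by the midpoint rule, and the function names are ours.

```python
def pulse(t, Tp):
    """Triangular pulse s(t) = t on (0, T'), zero elsewhere."""
    return t if 0.0 < t < Tp else 0.0

def output_closed_form(t, T, Tp):
    """s_0(t) = (|T - t| + 2T')(T' - |T - t|)^2 / 6 within T' of t = T, else 0."""
    x = abs(T - t)
    return (x + 2.0 * Tp) * (Tp - x) ** 2 / 6.0 if x < Tp else 0.0

def output_numeric(t, T, Tp, steps=20000):
    """Midpoint-rule value of the integral of k(tau) s(t - tau), k(tau) = s(T - tau)."""
    h = T / steps
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * h
        total += pulse(T - tau, Tp) * pulse(t - tau, Tp)
    return total * h

T, Tp = 2.0, 1.0  # hypothetical observation time and pulse duration
times = (0.5, 1.2, 1.7, 2.0, 2.5, 3.2)
checks = [(t, output_numeric(t, T, Tp), output_closed_form(t, T, Tp)) for t in times]
for t, numeric, exact in checks:
    print(t, round(numeric, 4), round(exact, 4))
```

At $t = T$ the output equals the energy $T'^3/3$ of the pulse, and it is symmetric about that instant, as the absolute value $|T - t|$ in the formula indicates.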



[Figure 2-3: panels showing the input signal $s(t)$, the impulse response of the matched filter, and the output signal $s_0(t)$.]

If the signal is confined to the observation interval $(0, T)$, as we have supposed,
the transfer function of the matched filter is proportional to the complex conjugate
of the spectrum

$$S(\omega) = \int_0^T s(t)\,e^{-i\omega t}\,dt$$

of the signal. It is defined by

$$y(\omega) = \int_{-\infty}^{\infty} k(\tau)\,e^{-i\omega\tau}\,d\tau = \int_0^T s(T - \tau)\,e^{-i\omega\tau}\,d\tau = \int_0^T s(u)\,e^{-i\omega(T - u)}\,du = e^{-i\omega T} S^*(\omega). \tag{2-85}$$

The factor $\exp(-i\omega T)$ corresponds to a delay of $T$ seconds in the response of the
filter.

The spectrum of the signal component $s_0(t)$ of the output of the matched filter
is

$$\int_{-\infty}^{\infty} s_0(t)\,e^{-i\omega t}\,dt = y(\omega)\,S(\omega) = e^{-i\omega T}\,|S(\omega)|^2,$$

and hence that component is

$$s_0(t) = \int_{-\infty}^{\infty} |S(\omega)|^2\,e^{i\omega(t - T)}\,\frac{d\omega}{2\pi};$$

because $|S(\omega)| = |S(-\omega)|$, this is

$$s_0(t) = \int_{-\infty}^{\infty} |S(\omega)|^2 \cos \omega(t - T)\,\frac{d\omega}{2\pi},$$

and we see that the signal component of the output of the matched filter is an even
function of $(t - T)$:

$$s_0(T - x) = s_0(T + x).$$

The matched filter can be constructed by methods that have been developed for
synthesizing filters with prescribed transient response [Gui57], [Yen64]. Matched
filters find extensive application in signal processing; Turin has written a thorough
review of this subject [Tur60a].

2.2.3.2. The matched filter for stationary colored noise. The impulse response
of the matched filter for detecting a signal in colored Gaussian noise has been
given in (2-81). When the noise is stationary and the observation interval, which we
now take as $(-T, T)$, is much longer than either the duration of the signal $s(t)$ or
the width of the autocovariance function $\phi(\tau)$, the solution $q(t)$ of (2-66) will be
close to the function $q_\infty(t)$ defined by (2-78) and (2-79), and the transfer function
of the matched filter in (2-81) will be approximately

$$y_\infty(\omega) = Q_\infty^*(\omega)\,e^{-i\omega T} = \frac{S^*(\omega)\,e^{-i\omega T}}{\Phi(\omega)} \tag{2-86}$$

[Dwo50]. If the spectral density of the noise can be factored as

$$\Phi(\omega) = \Gamma(\omega)\,\Gamma^*(\omega), \tag{2-87}$$

where the function $\Gamma(\omega)$ contains all the poles and zeros of the spectral density $\Phi(\omega)$
lying above the real axis in the $\omega$-plane, the transfer function $y_\infty(\omega)$ in (2-86) can be
expressed as

$$y_\infty(\omega) = y_1(\omega)\,y_2(\omega) \tag{2-88}$$

with

$$y_1(\omega) = \frac{1}{\Gamma(\omega)}, \qquad y_2(\omega) = \frac{S^*(\omega)\,e^{-i\omega T}}{\Gamma^*(\omega)}, \tag{2-89}$$

and the matched filtering can be carried out in two stages, as shown in Fig. 2-4.

Figure 2-4. Matched filter for a long observation interval: $y_1(\omega)\,y_2(\omega) = \exp(-i\omega T)\,S^*(\omega)/\Phi(\omega)$.

At the output of the first filter, the spectral density of the noise is uniform,
$\Phi_1(\omega) = \Phi(\omega)\,|y_1(\omega)|^2 = 1$; the noise $n_1(t)$ at that point is white. For this reason the
filter whose transfer function is $y_1(\omega) = 1/\Gamma(\omega)$ is called a whitening filter [Bod50].
The spectrum of the signal $s_1(t)$ at that point is

$$S_1(\omega) = S(\omega)\,y_1(\omega) = \frac{S(\omega)}{\Gamma(\omega)}.$$

This filter is causal because all the poles and zeros of $y_1(\omega)$ lie above the Re $\omega$-axis.
Its impulse response $k_1(\tau)$ is nonzero only for $\tau > 0$.

The task of the second filter is to facilitate the detection of a known signal,
$s_1(t)$, in white noise, and (2-85) and (2-89) show that it has indeed the proper transfer
function. A system such as this can be realized only approximately and even so only
by accepting a long delay $T$, but it serves to elucidate the results of our mathematical
analysis. The delay must be long enough so that the tail of the output $s_1(t)$ of the
whitening filter is negligible.

This approximate realization of the matched filter provides an alternative way
of calculating the maximum attainable signal-to-noise ratio (2-80),

$$d_\infty^2 = \int_{-\infty}^{\infty} [s_1(t)]^2\,dt, \tag{2-90}$$

which follows from (2-75) because at the output of the whitening filter the signal is
$s_1(t)$, and it is immersed in white noise of unit bilateral spectral density, $N/2 = 1$.

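Example 2-2 Triangular pulse in colored noise

Let the signal $s(t)$ have the same triangular form as in Example 2-1: $s(t) = t$, $0 < t < T'$;
$s(t) \equiv 0$, $t < 0$, $t > T'$. The noise is a combination of white noise having unilateral
spectral density $N$ and noise having a Lorentz spectral density with variance $\phi_0$:

$$\Phi(\omega) = \frac{N}{2} + \frac{2\mu\phi_0}{\mu^2 + \omega^2};$$

its autocovariance function is

$$\phi(\tau) = \frac{N}{2}\,\delta(\tau) + \phi_0\,e^{-\mu|\tau|}.$$

We write the spectral density as

$$\Phi(\omega) = \frac{N}{2}\,\frac{\beta^2 + \omega^2}{\mu^2 + \omega^2}, \qquad \beta^2 = \mu^2 + \frac{4\mu\phi_0}{N},$$

which can be factored as in (2-87) with

$$\Gamma(\omega) = \left(\frac{N}{2}\right)^{1/2} \frac{\beta + i\omega}{\mu + i\omega}.$$

The transfer function of the first filter in Fig. 2-4 is

$$y_1(\omega) = \left(\frac{2}{N}\right)^{1/2} \frac{\mu + i\omega}{\beta + i\omega} = \left(\frac{2}{N}\right)^{1/2} \left[1 - \frac{\beta - \mu}{\beta + i\omega}\right],$$

and its impulse response is

$$k_1(\tau) = \left(\frac{2}{N}\right)^{1/2} \left[\delta(\tau) - (\beta - \mu)\,e^{-\beta\tau}\,U(\tau)\right].$$

The signal component of the output of this whitening filter is

$$s_1(t) = \int_0^{\infty} k_1(\tau)\,s(t - \tau)\,d\tau = \left(\frac{2}{N}\right)^{1/2} \left[s(t) - (\beta - \mu) \int_0^{\infty} e^{-\beta\tau}\,s(t - \tau)\,d\tau\right]$$

$$= \begin{cases} 0, & t < 0, \\ \left(\dfrac{2}{N}\right)^{1/2} \beta^{-2} \left[\mu\beta t + (\beta - \mu)(1 - e^{-\beta t})\right], & 0 < t < T', \\ -\left(\dfrac{2}{N}\right)^{1/2} \beta^{-2} (\beta - \mu)\left(\beta T' - 1 + e^{-\beta T'}\right) e^{-\beta(t - T')}, & t > T'. \end{cases}$$

It is sketched in Fig. 2-5(a). The second filter in Fig. 2-4 is matched to this signal $s_1(t)$
with a delay $T$ that needs only to be long enough so that the tail of $s_1(t)$ is insignificant:

$$\beta(T - T') \gg 1.$$

The solution $q_\infty(t)$ of (2-77) is the inverse Fourier transform of

$$Q_\infty(\omega) = \frac{S(\omega)}{\Phi(\omega)} = \frac{2}{N}\,\frac{\mu^2 + \omega^2}{\beta^2 + \omega^2}\,S(\omega)$$

and can be considered as the output of a noncausal filter whose input is the signal $s(t)$
and whose impulse response is

$$k_\infty(\tau) = \frac{2}{N} \left[\delta(\tau) - \frac{\beta^2 - \mu^2}{2\beta}\,e^{-\beta|\tau|}\right].$$

For our triangular signal this is

$$q_\infty(t) = \begin{cases} -\dfrac{\beta^2 - \mu^2}{N\beta^3} \left[1 - e^{-\beta T'}(\beta T' + 1)\right] e^{\beta t}, & t < 0, \\ \dfrac{2}{N} \left\{ t - \dfrac{\beta^2 - \mu^2}{2\beta^3} \left[2\beta t + e^{-\beta t} - (\beta T' + 1)\,e^{-\beta(T' - t)}\right] \right\}, & 0 < t < T', \\ -\dfrac{\beta^2 - \mu^2}{N\beta^3} \left(\beta T' - 1 + e^{-\beta T'}\right) e^{-\beta(t - T')}, & t > T'. \end{cases}$$

It is sketched in Fig. 2-5(b).

The signal component of the output of the entire matched filter (2-81) when the
delay $T$ is so long that $T \gg T'$, $\beta(T - T') \gg 1$, is

$$s_0(t) = \int_{-\infty}^{\infty} \frac{|S(\omega)|^2}{\Phi(\omega)}\,e^{i\omega(t - T)}\,\frac{d\omega}{2\pi},$$

which by the convolution theorem can be expressed as

$$s_0(t) = \int_{-\infty}^{\infty} q_\infty(v)\,s(T - t + v)\,dv.$$

A somewhat tedious integration yields for this output $s_0(t) = \hat{s}(|t - T|)$, where

$$\hat{s}(t) = (N\beta^5)^{-1} \left\{ 2\beta^3\mu^2 h(t) - (\beta^2 - \mu^2) \left[ 2\beta t + e^{-\beta t}\left[2 - \beta^2 T'^2 - e^{-\beta T'}(1 + \beta T')\right] - (\beta T' + 1)\,e^{-\beta(T' - t)} \right] \right\}, \qquad 0 < t < T',$$

$$\hat{s}(t) = -(N\beta^5)^{-1} (\beta^2 - \mu^2) \left[1 - e^{-\beta T'}(\beta T' + 1)\right] \left(\beta T' - 1 + e^{-\beta T'}\right) e^{-\beta(t - T')}, \qquad t > T',$$

with

$$h(t) = \tfrac{1}{6}\,(2T' + t)(T' - t)^2$$

the form of the output of the matched filter in Example 2-1 for $0 < t < T'$. The
output signal $s_0(t)$ is sketched in Fig. 2-5(c). In particular, the upper bound on the
signal-to-noise ratio in (2-80) is obtained by setting $t = T$:

$$d_\infty^2 = s_0(T) = \hat{s}(0) = \frac{2T'^3}{3N} \left\{ \frac{\mu^2}{\beta^2} - \frac{3}{2}\,(\beta^2 - \mu^2) \left[2 - \beta^2 T'^2 - 2(1 + \beta T')\,e^{-\beta T'}\right] \beta^{-5}\,T'^{-3} \right\}.$$

The factor $2T'^3/3N = 2E/N$ is the signal-to-noise ratio for detection in the absence of
the colored component, whereupon $\beta = \mu$.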
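
The results of this example can be checked numerically. The sketch below rests on our reading of the example: a spectral density $N/2 + 2\mu\phi_0/(\mu^2 + \omega^2)$, the factor $\Gamma(\omega) = (N/2)^{1/2}(\beta + i\omega)/(\mu + i\omega)$ with $\beta^2 = \mu^2 + 4\mu\phi_0/N$, and $q_\infty(t)$ obtained by convolving $s(t)$ with $(2/N)[\delta(\tau) - ((\beta^2 - \mu^2)/2\beta)\,e^{-\beta|\tau|}]$. The parameter values and function names are hypothetical. It verifies that the whitening filter flattens the spectrum and that the integral of $s(t)\,q_\infty(t)$ agrees with a closed form for $d_\infty^2$.

```python
import math

# Hypothetical parameter values, chosen so that beta = 2 exactly.
N, MU, PHI0, TP = 2.0, 1.0, 1.5, 1.0
BETA = math.sqrt(MU ** 2 + 4.0 * MU * PHI0 / N)

def spectral_density(w):
    """White part N/2 plus a Lorentz part of variance PHI0."""
    return N / 2.0 + 2.0 * MU * PHI0 / (MU ** 2 + w ** 2)

def gamma_factor(w):
    """Assumed factor Gamma(w), with its pole and zero in the upper half-plane."""
    return math.sqrt(N / 2.0) * (BETA + 1j * w) / (MU + 1j * w)

def q_inf(t):
    """q_inf(t) on 0 < t < T' for the triangular signal s(t) = t."""
    c = (BETA ** 2 - MU ** 2) / (2.0 * BETA ** 3)
    return (2.0 / N) * (t - c * (2.0 * BETA * t + math.exp(-BETA * t)
                                 - (BETA * TP + 1.0) * math.exp(-BETA * (TP - t))))

def snr_numeric(steps=20000):
    """Midpoint-rule value of the integral of s(t) q_inf(t) over (0, T')."""
    h = TP / steps
    return h * sum((i + 0.5) * h * q_inf((i + 0.5) * h) for i in range(steps))

def snr_closed_form():
    """Closed form for the limiting signal-to-noise ratio under the same assumptions."""
    z = BETA * TP
    bracket = 2.0 - z * z - 2.0 * (1.0 + z) * math.exp(-z)
    return (2.0 * TP ** 3 / (3.0 * N)) * (
        MU ** 2 / BETA ** 2
        - 1.5 * (BETA ** 2 - MU ** 2) * bracket / (BETA ** 5 * TP ** 3))

# Phi(w) |1/Gamma(w)|^2 should equal 1, i.e. Phi(w) = |Gamma(w)|^2.
whitening_error = max(abs(spectral_density(w) - abs(gamma_factor(w)) ** 2)
                      for w in (0.0, 0.5, 1.0, 3.0, 10.0))
print(whitening_error, snr_numeric(), snr_closed_form())
```

With a colored component present the computed ratio falls below $2T'^3/3N$, the value for white noise alone, as one would expect when noise power is added.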

2.2.4 The Irrelevance Proof

An alternative derivation of the sufficient statistic for detecting the signal $s(t)$ in
Gaussian noise having autocovariance function $\phi(t, s)$ is carried out in the framework
of the reproducing-kernel Hilbert space (RKHS) introduced in Sec. 2.1.7. We
saw there that from an arbitrary set of linearly independent functions $g_k(t)$ we can
form a set of functions $f_k(t)$ that are orthonormal in the sense of (2-46), and that
samples $v'_k$ of the input $v(t)$ generated from these as in (2-47) are uncorrelated and
hence, being Gaussian random variables, are statistically independent. These new
functions $f_k(t)$ are not necessarily the eigenfunctions of (2-35).

If we now take the first member of our set as $g_1(t) = s(t)$, the first member of
the set of orthonormal functions will be

$$f_1(t) = d^{-1} s(t),$$

which is normalized in accordance with (2-46) by virtue of the definition of $d$ in
(2-67). Furthermore, $E(v'_k \mid H_0) = 0$, and by (2-47), with $F_k(t)$ the cofunction of
$f_k(t)$ as defined in (2-43) and (2-44),

$$E(v'_k \mid H_1) = \int_0^T F_k(t)\,E[v(t) \mid H_1]\,dt = \int_0^T F_k(t)\,s(t)\,dt = d\,(f_k, f_1) = \begin{cases} d, & k = 1, \\ 0, & k > 1. \end{cases}$$

The samples $v'_k$, $k \ge 2$, therefore have the same probability density functions under
both hypotheses. When we form the likelihood ratio

$$\frac{p_1(v'_1, v'_2, \ldots, v'_n)}{p_0(v'_1, v'_2, \ldots, v'_n)} = \prod_{k=1}^{n} \frac{p_1(v'_k)}{p_0(v'_k)},$$

all the factors will cancel except that for $v'_1$. This will be the case however large the
number $n$ of samples may be. The sample $v'_1$ is therefore a sufficient statistic. Because
the cofunction $F_1(t)$ equals $d^{-1} q(t)$ by (2-44) and (2-66), $v'_1 = d^{-1} g$ is proportional
to the statistic $g$ defined in (2-62), which is therefore also a sufficient statistic.

What we have done here has been to divide the input $v(t)$ into two statistically
independent parts,

$$v(t) = v'(t) + v''(t),$$

$$v'(t) = d^{-2} s(t)\,g,$$

$$v''(t) = \sum_{k=2}^{\infty} v'_k f_k(t),$$

in such a way that the distributions of $v''(t)$ are identical under both hypotheses
$H_0$ and $H_1$, and the random process $v''(\cdot)$ is uncorrelated with $v'(\cdot)$ and hence
statistically independent of it. Thus $v'(t)$ contains all the information in the input
relevant to making the decision between the hypotheses in the optimum fashion.
The component $v''(t)$ is irrelevant, and the foregoing argument is often called the
irrelevance proof that $g$ is the optimum detection statistic. It is due to Kailath
[Kai67], who developed the connection with reproducing-kernel Hilbert spaces in a
later paper [Kai71]. The latter presents some examples of RKHS norms, and further
methods of calculating such norms are outlined in [Kai72]. The signal-to-noise ratio
$d^2$ specified by (2-67) is the RKHS norm of our signal $s(t)$ as defined in (2-45), and
by working only with functions having a finite norm, this approach circumvents the
question of singular detection.

2.2.5 Discrete-time Processing 

In some situations it is convenient to sample the input $v(t)$ at $n$ times $t_1, t_2, \ldots, t_n$
uniformly separated throughout the observation interval $(0, T)$, creating a set of
samples $v_k = v(t_k)$ that can be collected into a row vector

$$\mathbf{v}^T = (v_1, v_2, \ldots, v_n).$$

Preliminary filtering will have removed noise of frequencies outside the spectral band
of the signal $s(t)$ to be detected. The samples are characterized by a covariance
matrix

$$\boldsymbol{\phi} = E(\mathbf{v}\mathbf{v}^T \mid H_0)$$

whose elements are

$$\phi_{jk} = \operatorname{Cov}(v_j, v_k) = E[v(t_j)\,v(t_k) \mid H_0]$$

as in Sec. 2.1.1. Their expected values are zero under hypothesis $H_0$; under hypothesis
$H_1$ they are temporal samples of the signal,

$$E(v_k \mid H_1) = s(t_k) = s_k.$$

The joint probability density function of the samples under hypothesis $H_0$ has
the form in (2-5),

$$p_0(\mathbf{v}) = (2\pi)^{-n/2}\,|\det \boldsymbol{\phi}|^{-1/2} \exp\left(-\tfrac{1}{2}\,\mathbf{v}^T \boldsymbol{\phi}^{-1} \mathbf{v}\right),$$

and under hypothesis $H_1$ it is

$$p_1(\mathbf{v}) = (2\pi)^{-n/2}\,|\det \boldsymbol{\phi}|^{-1/2} \exp\left[-\tfrac{1}{2}\,(\mathbf{v}^T - \mathbf{s}^T)\,\boldsymbol{\phi}^{-1} (\mathbf{v} - \mathbf{s})\right],$$

where $\mathbf{s}^T = (s_1, s_2, \ldots, s_n)$ is the row vector of the samples of the signal.

The optimum processing of these samples requires forming their likelihood
ratio

$$\Lambda(\mathbf{v}) = \frac{p_1(\mathbf{v})}{p_0(\mathbf{v})} = \exp\left[\tfrac{1}{2}\,\mathbf{v}^T \boldsymbol{\phi}^{-1} \mathbf{v} - \tfrac{1}{2}\,(\mathbf{v}^T - \mathbf{s}^T)\,\boldsymbol{\phi}^{-1} (\mathbf{v} - \mathbf{s})\right] = \exp\left[\mathbf{v}^T \boldsymbol{\phi}^{-1} \mathbf{s} - \tfrac{1}{2}\,\mathbf{s}^T \boldsymbol{\phi}^{-1} \mathbf{s}\right],$$

and we see that a sufficient statistic is the linear combination

$$g = \mathbf{v}^T \mathbf{q} = \sum_{k=1}^{n} q_k v_k,$$

where $\mathbf{q} = \boldsymbol{\phi}^{-1} \mathbf{s}$. The coefficients $q_k$ are the solutions of the linear equations

$$\mathbf{s} = \boldsymbol{\phi}\,\mathbf{q} \quad \text{(or)} \quad s_j = \sum_{k=1}^{n} \phi_{jk}\,q_k,$$

and they can be determined in advance. The statistic $g$ can be evaluated by a digital
computer or analogous device when the sampling intervals $T/n$ are sufficiently long.
It is compared with a decision level $g_0$, and as before, the decision is for hypothesis
$H_1$ when $g > g_0$ and for $H_0$ when $g < g_0$.
The expected values of the statistic $g$ are

$$E(g \mid H_0) = 0, \qquad E(g \mid H_1) = \mathbf{s}^T \mathbf{q} = d^2,$$

under the two hypotheses, and its variance is

$$\operatorname{Var} g = E(g^2 \mid H_0) = E(\mathbf{q}^T \mathbf{v}\mathbf{v}^T \mathbf{q} \mid H_0) = \mathbf{q}^T \boldsymbol{\phi}\,\mathbf{q} = \mathbf{s}^T \mathbf{q} = d^2$$

under both. The false-alarm and detection probabilities are again given by (2-72),
and the governing signal-to-noise ratio is

$$d^2 = \mathbf{s}^T \mathbf{q} = \sum_{k=1}^{n} s_k q_k,$$

which is the discrete-time counterpart of (2-67).
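
The discrete-time processing just described amounts to solving one set of linear equations in advance and forming a weighted sum of the samples. The sketch below illustrates it with a hypothetical covariance matrix (a white part plus an exponentially correlated part) and hypothetical signal samples; the elimination routine is a plain textbook implementation and all names are ours.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                 # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

n, dt = 8, 0.25
times = [(k + 1) * dt for k in range(n)]
# Hypothetical covariance matrix phi_jk: white part plus exponential part.
phi = [[(0.5 if j == k else 0.0) + math.exp(-abs(times[j] - times[k]))
        for k in range(n)] for j in range(n)]
s = list(times)                       # hypothetical signal samples s_k = t_k

q = solve_linear(phi, s)              # coefficients q = phi^{-1} s
d2 = sum(sk * qk for sk, qk in zip(s, q))

def statistic(v):
    """Sufficient statistic g = v^T q for a vector of input samples."""
    return sum(qk * vk for qk, vk in zip(q, v))

print(round(d2, 4), round(statistic(s), 4))
```

Applying the statistic to the noise-free signal samples returns $d^2$, in agreement with $E(g \mid H_1) = \mathbf{s}^T\mathbf{q}$.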
2.3 SOLUTION OF THE INTEGRAL EQUATIONS



2.3.1 Inhomogeneous Equations 

To determine the impulse response of the optimum filter for detecting a known signal
$s(t)$ in stationary Gaussian noise of autocovariance function $\phi(\tau)$, one must solve
an integral equation of the form

$$s(t) = \int_0^T \phi(t - u)\,q(u)\,du, \qquad 0 < t < T, \tag{2-91}$$

for the unknown function $q(t)$. This type of equation, which is called a Fredholm
integral equation of the first kind, occurs frequently in detection theory and in the
theory of linear prediction and filtering. A continuous solution $q(t)$ does not in
general exist for continuous $s(t)$ unless the kernel $\phi(t - u)$ has some singularity or
the range of integration is unbounded [Cou31, vol. 1, p. 135]. For certain types of
kernel a solution in closed form can be obtained, but it involves delta functions and
their derivatives. One's first thought is to treat (2-91) numerically, using a quadrature
formula to replace the integral by a summation and solving the resulting set of linear
simultaneous equations for the values of $q(t)$ at a finite set of points in the interval
$0 < t < T$, but a solution involving delta-function singularities can hardly be well
approximated in this way.

The situation is more favorable when, as is indeed usually the case, the noise
contains a white component that has a flat spectral density, that is, when the
autocovariance of the noise is of the form

$$\phi(\tau) = \frac{N}{2}\,\delta(\tau) + \pi(\tau),$$

where $\pi(\tau)$ is continuous, positive definite, and integrable. Then (2-91) becomes

$$s(t) = \frac{N}{2}\,q(t) + \int_0^T \pi(t - u)\,q(u)\,du, \qquad 0 < t < T; \tag{2-92}$$

this is a Fredholm integral equation of the second kind, and a solution will generally
exist unless $(-N/2)$ is an eigenvalue of the integral equation

$$\lambda f(t) = \int_0^T \pi(t - u)\,f(u)\,du$$

[Cou31, vol. 1, Ch. 3]. Because $\pi(t - u)$ is ordinarily a positive-definite kernel
representing an additive colored noise component, this integral equation cannot
have a negative eigenvalue, and there is no trouble about the existence of a solution
that is free of singularities. If the amount of colored noise is small, a solution of
(2-92) can be obtained by iteration,

$$q(t) = \frac{2}{N}\,s(t) - \left(\frac{2}{N}\right)^2 \int_0^T \pi(t - u)\,s(u)\,du + \cdots.$$

Otherwise one can calculate a solution numerically by replacing the integral by a
summation and solving the resulting simultaneous equations for the values of $q(t)$
at a finite set of points in $(0, T)$. See [Bak77, Ch. 4] for an exposition of these
numerical methods.

A type of autocovariance function $\phi(t - s)$ for which the integral equation
(2-91) can be solved explicitly is that characterizing stationary noise whose spectral
density $\Phi(\omega)$ is a rational function of the angular frequency $\omega$. Noise of this kind
can be generated by passing white noise through a linear filter composed of linear
circuit elements: resistors, inductors, and capacitors. The transfer function $y(\omega)$ of
such a filter is a rational function of $\omega$, and the spectral density $\Phi(\omega)$ of the noise
output, proportional to $|y(\omega)|^2$, is a rational function of $\omega^2$, which can be written in
the form

$$\Phi(\omega) = C\,\frac{\prod_{j=1}^{m} (\omega^2 + h_j^2)}{\prod_{k=1}^{n} (\omega^2 + \mu_k^2)} \tag{2-93}$$

with $m \le n$. The constants $h_j$ and $\mu_k$ are either real or, if complex, occur in
complex-conjugate pairs. We call noise of this kind leucogenic³ because it can be thought of as
arising from white noise through linear filtering. The autocovariance function of the
noise is the sum of exponential functions of $|\tau|$, possibly multiplied by polynomials.
Terms corresponding to complex $\mu_k$'s are bilaterally damped sinusoids.
The spectral density $\Phi(\omega)$ can be written in the form

$$\Phi(\omega) = \frac{N(\omega^2)}{P(\omega^2)}, \tag{2-94}$$

where

$$N(x) = \beta_m x^m + \beta_{m-1} x^{m-1} + \cdots + \beta_1 x + \beta_0$$

is a polynomial of degree $m$ and

$$P(x) = \alpha_n x^n + \alpha_{n-1} x^{n-1} + \cdots + \alpha_1 x + \alpha_0$$

is a polynomial of degree $n$; $m \le n$. Then the autocovariance function $\phi(t - u)$ is
a solution of the linear differential equation

$$P(-D^2)\,\phi(t - u) = N(-D^2)\,\delta(t - u), \qquad D = \frac{d}{dt}, \quad D^2 = \frac{d^2}{dt^2}, \quad -\infty < t < \infty, \tag{2-95}$$

in which the differential operators on each side are obtained by replacing $x$ by $-D^2$
everywhere:

$$P(-D^2) = (-1)^n \alpha_n \frac{d^{2n}}{dt^{2n}} + (-1)^{n-1} \alpha_{n-1} \frac{d^{2(n-1)}}{dt^{2(n-1)}} + \cdots - \alpha_1 \frac{d^2}{dt^2} + \alpha_0,$$

$$N(-D^2) = (-1)^m \beta_m \frac{d^{2m}}{dt^{2m}} + (-1)^{m-1} \beta_{m-1} \frac{d^{2(m-1)}}{dt^{2(m-1)}} + \cdots - \beta_1 \frac{d^2}{dt^2} + \beta_0.$$

The boundary conditions on (2-95) are that $\phi(t - u)$ go to zero as $t$ goes to $\infty$
and $-\infty$. Indeed, taking the Fourier transform of (2-95), with (2-7), yields (2-94)
immediately.

³ Greek λευκός = white, γένος = lineage, family, descent.

Let us now operate on both sides of (2-91) with the operator $P(-D^2)$. By
virtue of (2-95) we obtain

$$P(-D^2)\,s(t) = P(-D^2) \int_0^T \phi(t - u)\,q(u)\,du = N(-D^2) \int_0^T \delta(t - u)\,q(u)\,du,$$

or

$$P(-D^2)\,s(t) = N(-D^2)\,q(t), \qquad 0 < t < T. \tag{2-96}$$

This is an inhomogeneous linear differential equation of degree $2m$ for $q(t)$. Its solution involves the sum of a solution of the homogeneous differential equation

$$N(-D^2)\,q_1(t) = 0 \tag{2-97}$$

and a particular solution $q_0(t)$ of (2-96). One particular solution can be found by taking the Fourier transform of (2-96) and is the same as $q_\infty(t)$ defined by (2-78) and (2-79). This may not be the simplest particular solution, especially if the signal $s(t)$ is undefined outside the interval $(0, T)$. It may be more convenient to solve (2-96) by Laplace transformation, and sometimes a particular solution can be written down by inspection. The solution of (2-97) will have the form

$$q_1(t) = \sum_{k=1}^{m} \{c_k \exp(-h_k t) + d_k \exp[-h_k(T - t)]\}, \tag{2-98}$$

where $h_k$ and $-h_k$, $1 \le k \le m$, are the $2m$ roots of the algebraic equation

$$N(-h_k^2) = 0; \tag{2-99}$$

as in (2-93) the zeros of the spectral density $\Phi(\omega)$ are $\pm i h_k$, $1 \le k \le m$.

Additional terms must be included in the solution of (2-91) to account for the finite end points 0 and $T$ of the range of integration. These have been shown to involve delta functions and their derivatives situated at those points, but standing just inside the interval $(0, T)$. The complete solution of (2-91) then has the form

$$q(t) = q_0(t) + \sum_{j=0}^{n-m-1} [a_j \delta^{(j)}(t) + b_j \delta^{(j)}(t - T)] + \sum_{k=1}^{m} \{c_k \exp(-h_k t) + d_k \exp[-h_k(T - t)]\}, \tag{2-100}$$

in which the $a$'s, $b$'s, $c$'s, and $d$'s are constants to be determined [Zad50], [Zad52]; $q_0(t)$ is $q_\infty(t)$ of (2-78) and (2-79) or any other convenient particular solution of (2-96). Here $\delta^{(j)}(t - u)$ is the $j$th derivative of the delta function, defined by

$$\int_a^b f(t)\,\delta^{(j)}(t - u)\,dt = (-1)^j \left.\frac{d^j f(t)}{dt^j}\right|_{t=u} = (-1)^j f^{(j)}(u), \qquad a < u < b,$$

the superscript $(j)$ indicating $j$-fold differentiation. When $m = n$, the terms with the delta functions do not appear in (2-100); when $m = 0$, the exponential functions are absent.

The detection statistic $g$ is obtained by substituting $q(t)$ into (2-65):

$$g = \int_0^T q_0(t)\,v(t)\,dt + \sum_{j=0}^{n-m-1} (-1)^j [a_j v^{(j)}(0) + b_j v^{(j)}(T)] + \sum_{k=1}^{m} \int_0^T \{c_k \exp(-h_k t) + d_k \exp[-h_k(T - t)]\}\,v(t)\,dt. \tag{2-101}$$



The terms in the first summation require the input $v(t)$ to be differentiated at most $n - m - 1$ times and sampled at $t = 0$ and $t = T$. The noise in the input can indeed be differentiated as many as $n - m - 1$ times because its spectral density $\Phi(\omega)$ decreases at infinity like $|\omega|^{-2(n-m)}$, and the variance

$$\operatorname{Var} n^{(n-m-1)}(t) = \int_{-\infty}^{\infty} \omega^{2(n-m-1)}\,\Phi(\omega)\,\frac{d\omega}{2\pi}$$

of the $(n - m - 1)$th derivative of the noise $n(t)$ is finite. The remaining terms in (2-101) can be generated by passing the input through a filter matched to the signal

$$q_0(t) + \sum_{k=1}^{m} \{c_k \exp(-h_k t) + d_k \exp[-h_k(T - t)]\}$$

and sampling the output at the end of the observation interval. Although the solution of the integral equation contains delta functions and their derivatives, no singularities appear in the expression for the sufficient statistic $g$.

The solution in (2-100) involves $2n$ constants, $n - m$ each of the $a$'s and $b$'s, and $m$ each of the $c$'s and $d$'s. To find them, one can substitute $q(t)$ from that equation into the integral equation (2-91). After the integration is carried out, one will be able to cancel $s(t)$, and there will remain $2n$ distinct functions of $t$, each multiplied by some linear combination of the unknown constants. If the $2n$ poles $\pm i m_k$ of $\Phi(\omega)$ are distinct, these functions will be $\exp m_k t$ and $\exp(-m_k t)$, $k = 1, 2, \ldots, n$. The coefficients of each of these functions must vanish in order for the integral equation to be satisfied, and one obtains in this way $2n$ linear equations that can be solved for the unknown constants in (2-100).

Example 2-3 Lorentz spectral density: the inhomogeneous equation

As a simple example, let the autocovariance of the noise be

$$\phi(\tau) = \phi_0 e^{-\mu|\tau|}. \tag{2-102}$$

The spectral density then has the "Lorentz" form,

$$\Phi(\omega) = \frac{2\mu\phi_0}{\omega^2 + \mu^2}, \tag{2-103}$$

and in (2-93) $m = 0$, $n = 1$, $C = 2\mu\phi_0$, and $m_1 = \mu$. The particular solution $q_0(t)$ is now, by (2-78) and (2-79) and with $S(\omega)$ the spectrum of the signal $s(t)$,

$$q_0(t) = q_\infty(t) = (2\mu\phi_0)^{-1} \int_{-\infty}^{\infty} (\omega^2 + \mu^2)\,S(\omega)\,e^{i\omega t}\,\frac{d\omega}{2\pi} = (2\mu\phi_0)^{-1}[\mu^2 s(t) - s''(t)],$$

where the primes indicate differentiation with respect to $t$. The solution of the integral equation will be

$$q(t) = (2\mu\phi_0)^{-1}[\mu^2 s(t) - s''(t)] + a_0\,\delta(t) + b_0\,\delta(t - T) \tag{2-104}$$

as in (2-100). When this is substituted into the integral equation (2-91), we obtain

$$s(t) = \phi_0 \int_0^T e^{-\mu|t-u|} q(u)\,du = \phi_0[a_0 e^{-\mu t} + b_0 e^{-\mu(T-t)}] + (2\mu)^{-1} e^{-\mu t} \int_0^t e^{\mu u}[\mu^2 s(u) - s''(u)]\,du + (2\mu)^{-1} e^{\mu t} \int_t^T e^{-\mu u}[\mu^2 s(u) - s''(u)]\,du.$$

Integrating the terms containing $s''(u)$ twice by parts, we finally get

$$s(t) = \phi_0[a_0 e^{-\mu t} + b_0 e^{-\mu(T-t)}] - (2\mu)^{-1}\{e^{-\mu t}[\mu s(0) - s'(0)] + e^{-\mu(T-t)}[\mu s(T) + s'(T)] - 2\mu s(t)\}.$$

In order for this equation to hold for all values of $t$ in the interval $(0, T)$, the coefficients of $e^{-\mu t}$ and $e^{-\mu(T-t)}$ must each vanish, and the coefficients $a_0$ and $b_0$ must be given by

$$a_0 = \frac{\mu s(0) - s'(0)}{2\mu\phi_0}, \qquad b_0 = \frac{\mu s(T) + s'(T)}{2\mu\phi_0}. \tag{2-105}$$
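
The result (2-104) with (2-105) can be checked numerically by substituting it back into the integral equation (2-91). The following Python sketch does so for an arbitrarily chosen test signal $s(t) = 1 + t^2$; the function names, the parameter values, and the use of Simpson's rule are all assumptions made for illustration, not part of the original treatment.

```python
import math

def simpson(f, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def lorentz_solution(s, ds, d2s, mu, phi0, T):
    """Smooth part q0(t) and delta-function weights a0, b0 of (2-104)-(2-105)."""
    q0 = lambda t: (mu**2 * s(t) - d2s(t)) / (2.0 * mu * phi0)
    a0 = (mu * s(0.0) - ds(0.0)) / (2.0 * mu * phi0)
    b0 = (mu * s(T) + ds(T)) / (2.0 * mu * phi0)
    return q0, a0, b0

def kernel_applied(q0, a0, b0, mu, phi0, T, t):
    """Integral of phi(t - u) q(u) over (0, T), i.e. the right side of (2-91)."""
    integrand = lambda u: phi0 * math.exp(-mu * abs(t - u)) * q0(u)
    # Split at u = t, where the exponential kernel has a kink.
    smooth = simpson(integrand, 0.0, t) + simpson(integrand, t, T)
    # The delta functions at u = 0 and u = T simply sample the kernel there.
    deltas = phi0 * (a0 * math.exp(-mu * t) + b0 * math.exp(-mu * (T - t)))
    return smooth + deltas

# Hypothetical test case, chosen only for illustration.
MU, PHI0, T_OBS = 2.0, 1.5, 1.0
signal = lambda t: 1.0 + t * t
d_signal = lambda t: 2.0 * t
d2_signal = lambda t: 2.0

q0, a0, b0 = lorentz_solution(signal, d_signal, d2_signal, MU, PHI0, T_OBS)
residuals = [kernel_applied(q0, a0, b0, MU, PHI0, T_OBS, t) - signal(t)
             for t in (0.1, 0.5, 0.9)]
```

The residuals should vanish to within the quadrature error, which confirms that the end-point delta functions are needed: without them the exponential terms in $e^{-\mu t}$ and $e^{-\mu(T-t)}$ would remain.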
The detection statistic $g$ now becomes

$$g = (2\mu\phi_0)^{-1}\left\{[\mu s(0) - s'(0)]\,v(0) + [\mu s(T) + s'(T)]\,v(T) + \int_0^T [\mu^2 s(t) - s''(t)]\,v(t)\,dt\right\}. \tag{2-106}$$

The signal-to-noise ratio $d^2$ that determines the probability of detection is obtained by putting $s(t)$ for $v(t)$ in $g$, and after an integration by parts we find

$$d^2 = (2\phi_0)^{-1}\left\{[s(0)]^2 + [s(T)]^2 + \frac{1}{\mu}\int_0^T \left(\mu^2 [s(t)]^2 + [s'(t)]^2\right) dt\right\}. \tag{2-107}$$

This will be finite provided the signal is differentiable at least once within the interval $(0, T)$. As defined in (2-43) and (2-44), the RKHS scalar product between two functions $h(t)$ and $m(t)$, each differentiable at least once in $(0, T)$, can be written down by replacing $s(t)$ by $h(t)$ and $v(t)$ by $m(t)$ in (2-106) and then integrating by parts:

$$\langle h, m \rangle = (2\phi_0)^{-1}[h(0)m(0) + h(T)m(T)] + \frac{1}{2\mu\phi_0}\int_0^T [\mu^2 h(t)m(t) + h'(t)m'(t)]\,dt.$$

If the rational spectral density $\Phi(\omega)$ is any more complicated than the one in this example, the method of substituting $q(t)$ of (2-100) into the integral equation (2-91) will be extremely tedious. Formal schemes requiring less labor have been developed by Slepian and Kadota [Sle69] and by Baggeroer [Bag69], [Bag71]; see also Appendix A.
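
Returning briefly to Example 2-3, the agreement between the statistic (2-106) evaluated with $v(t) = s(t)$ and the closed form (2-107) can be confirmed numerically. The sketch below uses a hypothetical half-sine pulse; the function names and parameter values are illustrative assumptions only.

```python
import math

def simpson(f, lo, hi, n=400):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def snr_from_statistic(s, ds, d2s, mu, phi0, T):
    """d^2 obtained by putting v(t) = s(t) in the statistic g of (2-106)."""
    ends = (mu * s(0.0) - ds(0.0)) * s(0.0) + (mu * s(T) + ds(T)) * s(T)
    body = simpson(lambda t: (mu**2 * s(t) - d2s(t)) * s(t), 0.0, T)
    return (ends + body) / (2.0 * mu * phi0)

def snr_closed_form(s, ds, mu, phi0, T):
    """d^2 from (2-107), which needs only the first derivative of the signal."""
    body = simpson(lambda t: mu**2 * s(t)**2 + ds(t)**2, 0.0, T)
    return (s(0.0)**2 + s(T)**2 + body / mu) / (2.0 * phi0)

# Hypothetical half-sine pulse, used only as an illustration.
MU, PHI0, T_OBS = 3.0, 0.5, 2.0
w = math.pi / T_OBS
pulse = lambda t: math.sin(w * t)
d_pulse = lambda t: w * math.cos(w * t)
d2_pulse = lambda t: -w * w * math.sin(w * t)

d2_a = snr_from_statistic(pulse, d_pulse, d2_pulse, MU, PHI0, T_OBS)
d2_b = snr_closed_form(pulse, d_pulse, MU, PHI0, T_OBS)
```

For this particular pulse both expressions reduce to $(\mu^2 + w^2)T/(4\mu\phi_0)$ with $w = \pi/T$, which gives an independent check on the two numerical values.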

If $\phi(t, s)$ is the autocovariance function of nonstationary leucogenic noise, the integral equation (2-66) can be solved by a natural extension of the method of this section; it is described by Laning and Battin [Lan56, Sec. 8.5] and by Miller and Zadeh [Mil56]. If this kind of noise contains a component that is white, so that the kernel of (2-66) has the form

$$\phi(t, s) = R(t)\,\delta(t - s) + \pi(t, s),$$

the integral equation becomes

$$s(t) = R(t)\,q(t) + \int_0^T \pi(t, s)\,q(s)\,ds.$$

If the function $\pi(t, s)$ is the autocovariance of leucogenic noise $y(t)$ that can be modeled as the output of a linear system characterized by a finite number of state variables and driven by white noise, this equation can be solved by a method due to Baggeroer [Bag69], [Bag71]. One must be able to construct a linear system that depends on a finite number of state variables and is driven by white noise and whose output is the stochastic process $y(t)$. We shall examine a method of this kind in Sec. 11.4.

Kailath [Kai66] has shown how to solve the integral equation (2-66) for a kernel of the form

$$\phi(t, s) = \begin{cases} f(t)\,g(s), & 0 \le t \le s \le T, \\ f(s)\,g(t), & 0 \le s \le t \le T, \end{cases} \tag{2-108}$$

where $g(t)$ and $f(t)$ are continuous functions of $t$ and their quotient $f(t)/g(t)$ is continuous and strictly increasing in the interval $(0, T)$. Equations with similar kernels were treated by Shinbrot [Shi57]. Kailath [Kai66b] has also solved the integral equation for a triangular kernel, $\phi(t, s) = 1 - |t - s|$, $0 \le |t - s| \le 1$, $\phi(t, s) = 0$, $|t - s| > 1$, and for linear combinations of the triangular kernel and a kernel of the type of (2-108).

2.3.2 Homogeneous Equations

When the kernel $\phi(t - s)$ of the homogeneous integral equation

$$\lambda f(t) = \int_0^T \phi(t - s)\,f(s)\,ds \tag{2-109}$$

is the autocovariance function of noise with a rational spectral density as in (2-93), the process of solving it is much like the method of Sec. 2.3.1. The solution will in general be a combination of exponential functions

$$f(t) = \sum_{k=1}^{n} [g_k \exp p_k t + e_k \exp(-p_k t)]. \tag{2-110}$$

When this is substituted into (2-109), it is found that the $n$ numbers $p_k$ must be solutions of the algebraic equation

$$N(-p^2) - \lambda P(-p^2) = 0, \qquad p = \pm p_k, \quad k = 1, 2, \ldots, n, \tag{2-111}$$

where $N(\omega^2)$ and $P(\omega^2)$ are the polynomials in the numerator and the denominator of the spectral density $\Phi(\omega)$ as in (2-94).

At the same time, certain linear combinations of the functions $\exp m_k t$ and $\exp(-m_k t)$ appear, where $i m_k$, $-i m_k$ are the $2n$ roots of the equation $P(\omega^2) = 0$, $k = 1, 2, \ldots, n$. These linear combinations must vanish in order for the integral equation to be satisfied, and in this way one obtains $2n$ homogeneous linear equations for the coefficients $g_k$ and $e_k$, $k = 1, 2, \ldots, n$. These linear equations have a nonzero solution only when the parameter $\lambda$ is one of the eigenvalues of the integral equation (2-109), and the vanishing of the determinant of the coefficients of the $2n$ linear equations provides a transcendental equation for the eigenvalues. General formulas that abbreviate the labor have been given by Youla [You57] and by Slepian and Kadota [Sle69]; see also Appendix A.

Example 2-4 Lorentz spectral density: the homogeneous equation

For the exponential kernel in (2-102) we can use the solution in (2-104) and (2-105) by putting $s(t) = \lambda f(t)$:

$$f(t) = \frac{\lambda}{2\mu\phi_0}\left\{[\mu f(0) - f'(0)]\,\delta(t) + [\mu f(T) + f'(T)]\,\delta(t - T) + \mu^2 f(t) - f''(t)\right\}.$$

It is apparent from (2-109) that the solution $f(t)$ cannot contain any delta functions. The coefficients of $\delta(t)$ and $\delta(t - T)$ must therefore vanish, and we obtain the boundary conditions

$$\mu f(0) - f'(0) = 0, \qquad \mu f(T) + f'(T) = 0,$$

on the differential equation

$$f''(t) + \Gamma^2 f(t) = 0, \qquad \Gamma^2 = 2\mu\phi_0\lambda^{-1} - \mu^2. \tag{2-112}$$

The solution of this differential equation is

$$f(t) = A\cos\Gamma t + B\sin\Gamma t,$$

and the boundary conditions give two equations for the coefficients $A$ and $B$:

$$\mu A - \Gamma B = 0,$$

$$\mu(A\cos\Gamma T + B\sin\Gamma T) + \Gamma(B\cos\Gamma T - A\sin\Gamma T) = 0.$$

In order for a nonzero solution of these equations to exist, the determinant of the coefficients of $A$ and $B$ must vanish:

$$\begin{vmatrix} \mu & -\Gamma \\ \mu\cos\Gamma T - \Gamma\sin\Gamma T & \mu\sin\Gamma T + \Gamma\cos\Gamma T \end{vmatrix} = 0.$$

This equation determines the values of $\Gamma$ and hence, from (2-112), of the eigenvalues $\lambda$. If we number the eigenfunctions starting from $k = 0$, we obtain from the determinant

$$(m^2 - g_k^2)\tan g_k + 2 m g_k = 0, \qquad g_k = \Gamma_k T, \quad m = \mu T.$$

By substituting $g_k = m\cot\theta_k$, we find $\tan g_k = \tan 2\theta_k$, whereupon

$$m\cot\theta_k = 2\theta_k + k\pi, \qquad k = 0, 1, 2, \ldots. \tag{2-113}$$

This can easily be solved by Newton's method. The angles $\theta_k$ lie between 0 and $\pi/2$, decreasing toward 0 with increasing $k$. From (2-112) we then obtain

$$\lambda_k = \frac{2 m \phi_0 T}{g_k^2 + m^2} = \frac{2\phi_0 T}{m}\sin^2\theta_k.$$
It can be shown that for $m = \mu T \ll 1$,

$$\theta_0 \approx \sqrt{\tfrac{1}{2}m}\,\left(1 - \tfrac{1}{12}m\right), \qquad \lambda_0 \approx \phi_0 T\left(1 - \tfrac{1}{3}m\right),$$

and for $m = \mu T \gg \max(1, k\pi)$,

$$\lambda_k \approx \frac{2\phi_0 T}{m}.$$

Figure 2-6. Eigenvalues of the kernel $\phi(t - u) = \phi_0 e^{-\mu|t-u|}$.

The first five eigenvalues of the integral equation are plotted versus the parameter $\mu T$ in Fig. 2-6. The eigenfunctions $f_k(t)$ are proportional to $\cos\Gamma_k(t - \tfrac{1}{2}T)$ for $k$ even and to $\sin\Gamma_k(t - \tfrac{1}{2}T)$ for $k$ odd. The number of oscillations of the eigenfunction in the interval $(0, T)$ increases with the index $k$ as the eigenvalue $\lambda_k$ decreases. The eigenfunction $f_k(t)$ in this example has exactly $k$ zeros in the interval $(0, T)$.
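
The Newton iteration mentioned for (2-113) can be sketched in a few lines of Python. The function names, the starting rule, and the parameter values below are assumptions chosen for illustration; the code solves $m\cot\theta_k = 2\theta_k + k\pi$ and then forms $\lambda_k = (2\phi_0 T/m)\sin^2\theta_k$.

```python
import math

def theta_k(m, k, tol=1e-14, max_iter=100):
    """Solve m*cot(theta) = 2*theta + k*pi for theta in (0, pi/2) by Newton's method."""
    F = lambda th: m / math.tan(th) - 2.0 * th - k * math.pi
    dF = lambda th: -m / math.sin(th) ** 2 - 2.0
    # F is convex and decreasing on (0, pi/2); starting where F > 0 (left of the
    # root) makes the Newton iterates increase monotonically toward the root.
    th = math.pi / 4.0
    while F(th) <= 0.0:
        th *= 0.5
    for _ in range(max_iter):
        step = F(th) / dF(th)
        th -= step
        if abs(step) < tol:
            break
    return th

def eigenvalues(mu, phi0, T, count):
    """First `count` eigenvalues lambda_k = (2*phi0*T/m) * sin(theta_k)**2, m = mu*T."""
    m = mu * T
    return [2.0 * phi0 * T / m * math.sin(theta_k(m, k)) ** 2 for k in range(count)]

first_five = eigenvalues(mu=1.0, phi0=1.0, T=1.0, count=5)
```

A useful sanity check is that the eigenvalues of a continuous positive-definite kernel sum to $\int_0^T \phi(0)\,dt = \phi_0 T$, so the partial sums of the computed $\lambda_k$ should approach $\phi_0 T$ from below.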

If the kernel is not of the exponential type studied here, the homogeneous equation (2-35) may be very difficult to solve exactly. Many approximation techniques have been developed for calculating the eigenfunctions and eigenvalues of positive-definite linear operators, especially in connection with problems in quantum mechanics. Methods and references to the literature are given by Morse and Feshbach [Mor53, Ch. 9] and Baker [Bak77, Ch. 3]. A convenient one, known as the Rayleigh-Ritz method, is based on the fact that among all functions $g(t)$ that are normalized in the sense that

$$\int_0^T [g(t)]^2\,dt = 1, \tag{2-114}$$

the quantity

$$R = \int_0^T\!\!\int_0^T g(t)\,\phi(t, s)\,g(s)\,dt\,ds \tag{2-115}$$

is stationary when $g(t)$ is an eigenfunction of (2-35), and the value it then takes on equals the corresponding eigenvalue $\lambda$. If we use for $g(t)$ a linear combination of a finite number of functions with arbitrary coefficients,

$$g(t) = \sum_{k=1}^{n} c_k h_k(t), \tag{2-116}$$

substitute into (2-115), and vary the coefficients $c_k$, taking account of the constraint (2-114), until a stationary value of $R$ is found, we determine a set of homogeneous linear simultaneous equations for the coefficients $c_k$:

$$\sum_{k=1}^{n} (m_{jk} - \lambda h_{jk})\,c_k = 0, \qquad 1 \le j \le n,$$

where $m_{jk} = \int_0^T\!\int_0^T h_j(t)\,\phi(t, s)\,h_k(s)\,dt\,ds$ and $h_{jk} = \int_0^T h_j(t)\,h_k(t)\,dt$. In order for these equations to have a solution other than $c_k \equiv 0$, the determinant of the coefficients must vanish:

$$\det(m_{jk} - \lambda h_{jk}) = 0.$$

Computer programs exist for finding the roots $\lambda$ of this determinantal equation and for calculating the associated coefficients $c_k$. Each such root is an approximation to an eigenvalue of (2-35), and when the coefficients $c_k$ thus calculated are substituted into (2-116), the resulting function $g(t)$ is an approximation to the associated eigenfunction.
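
As a small illustration of the Rayleigh-Ritz procedure, the Python sketch below applies it to the exponential kernel of Example 2-4 with only two trial functions, $h_1(t) = 1$ and $h_2(t) = t(T - t)$. The trial functions, the quadrature rule, and all names are assumptions made here; with two functions the determinantal equation reduces to a quadratic in $\lambda$ that can be solved directly.

```python
import math

def simpson(f, lo, hi, n=60):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def kernel_element(hj, hk, phi, T):
    """m_jk: double integral of hj(t) * phi(t - s) * hk(s) over the square (0, T)^2."""
    def inner(t):
        g = lambda s: phi(t - s) * hk(s)
        # Split at s = t so that a kink of the kernel at zero lag is handled cleanly.
        return simpson(g, 0.0, t) + simpson(g, t, T)
    return simpson(lambda t: hj(t) * inner(t), 0.0, T)

def gram_element(hj, hk, T):
    """h_jk: integral of hj(t) * hk(t) over (0, T)."""
    return simpson(lambda t: hj(t) * hk(t), 0.0, T)

def rayleigh_ritz_2(h1, h2, phi, T):
    """Roots of det(m_jk - lambda * h_jk) = 0 for two trial functions, largest first."""
    m11 = kernel_element(h1, h1, phi, T)
    m12 = kernel_element(h1, h2, phi, T)
    m22 = kernel_element(h2, h2, phi, T)
    g11, g12, g22 = gram_element(h1, h1, T), gram_element(h1, h2, T), gram_element(h2, h2, T)
    # Expand the 2x2 determinant into the quadratic a*x^2 - b*x + c = 0 in lambda.
    a = g11 * g22 - g12 ** 2
    b = m11 * g22 + m22 * g11 - 2.0 * m12 * g12
    c = m11 * m22 - m12 ** 2
    root = math.sqrt(b * b - 4.0 * a * c)
    return (b + root) / (2.0 * a), (b - root) / (2.0 * a)

T_OBS = 1.0
phi = lambda tau: math.exp(-abs(tau))    # phi0 = 1, mu = 1 (illustrative values)
lam_max, lam_min = rayleigh_ritz_2(lambda t: 1.0, lambda t: t * (T_OBS - t), phi, T_OBS)
```

The larger root approximates the largest eigenvalue from below. With the constant trial function alone the estimate would be $2/e \approx 0.7358$ for these values, so the second trial function brings the estimate noticeably closer to the value obtained from (2-113).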

The homogeneous integral equation (2-109) has been solved by Slepian and Pollak [Sle61] for a kernel that is the autocovariance function of bandlimited white noise, a function of the $(\sin x)/x$ form in the time difference $t - s$. The eigenfunctions are angular prolate spheroidal functions, and they possess the unusual property of being orthogonal over both the finite interval $(0, T)$ and the infinite interval $(-\infty, \infty)$. These solutions have provided the basis of an extensive treatment of the uncertainty relation for signals [Lan61], [Lan62].

If the noise whose autocovariance function $\phi(t, s)$ is the kernel of the homogeneous equation (2-35) can be modeled as the output of a linear system driven by white noise, the eigenvalues $\lambda$ can be calculated numerically by a method given by Baggeroer [Bag69].



2.4 PHYSICAL INTERPRETATION OF THE SIGNAL-TO-NOISE RATIO

In Sec. 2.2 we studied the detection of a signal $s(t)$ in white Gaussian noise $n(t)$, and there we learned that the signal-to-noise ratio $d^2$ determining the probability of detection is equal to $2E/N$, where $N$ is the unilateral spectral density of the noise $n(t)$ and

$$E = \int_0^T [s(t)]^2\,dt.$$

We called $E$ the energy of the signal $s(t)$.

The signal $s(t)$ is the voltage induced across the terminals of the receiving antenna by the electromagnetic pulse launched by a distant transmitter. The noise $n(t)$ is a random voltage induced across those terminals by the fluctuations of the electromagnetic field in the neighborhood of the antenna, and these are caused by thermal agitation of ions and electrons everywhere in the universe. We now want to show that this signal-to-noise ratio $d^2$ is equal to $2\varepsilon/kT_0$, where $\varepsilon$ is the maximum physical energy that the receiver can extract from the electromagnetic field carrying the signal, $k = 1.38 \cdot 10^{-23}$ joule/deg is Boltzmann's constant, and $T_0$ is the effective absolute temperature of the fluctuating field. By effective temperature we mean that the spectral density of the noise $n(t)$ is the same as though antenna, receiver, and all were enclosed in an enormous box that is in thermal equilibrium at absolute temperature $T_0$. Physicists call such an enclosure a Hohlraum.



2.4.1 The Transmission-line Model

The simplest way to show this is based on the same model that Nyquist used to derive the spectral density of the fluctuating voltage across a resistor [Nyq28]. We imagine that a very long transmission line is connected to the terminals of the antenna and that everything is enclosed in a Hohlraum in thermal equilibrium at absolute temperature $T_0$. The transmission line is matched to the antenna so that at all frequencies it extracts the maximum power from the ambient field. The electromagnetic field of the line is decomposed into normal modes, and the amplitude of the $j$th mode is denoted by $v_j$. This modal coefficient $v_j$ is so scaled that $v_j^2$ is the energy in the $j$th mode.

Under hypothesis $H_0$, $v_j$, as a linear functional of the Gaussian noise process $n(t)$, is a Gaussian random variable with zero expected value and a variance $\operatorname{Var} v_j$ equal to the average thermal energy in the $j$th mode. Each $v_j$ constitutes a degree of freedom of the system. According to the theorem of equipartition of energy, each degree of freedom must hold an average energy

$$\sigma^2 = \operatorname{Var} v_j = \tfrac{1}{2} k T_0$$

when the entire system is in thermal equilibrium at absolute temperature $T_0$. The modal coefficients $v_j$ are statistically independent.

Let us suppose that the signal $s(t)$ has arrived, and let us measure the modal amplitudes $v_j$ at such a time that the entire signal field is on its way down the transmission line, but before it has reached the end of the line. The line is assumed long enough so that this will be possible. Then under hypothesis $H_1$ the expected value of $v_j$ will equal the amplitude $s_j$ of the $j$th mode of the field in the line created by the signal. We base our decision about the presence or absence of the signal on the observed values of all the modal amplitudes $v_j$. Under the Neyman-Pearson criterion the optimum way to process these amplitudes is to form their likelihood ratio and compare it with a decision level $\Lambda_0$ set to yield a preassigned false-alarm probability $Q_0$, just as we did in Sec. 2.2. The entire analysis continues as in that section, with the variances $\lambda_k$ in (2-51) through (2-63) replaced by $\sigma^2$. The optimum detection statistic is a linear combination of the modal amplitudes $v_j$, weighted by the signal amplitudes $s_j$. The false-alarm and detection probabilities are again given by (2-72), and by (2-63) the governing signal-to-noise ratio equals

$$d^2 = \frac{1}{\sigma^2}\sum_j s_j^2 = \frac{2\varepsilon}{kT_0} \tag{2-117}$$

because the available signal energy $\varepsilon$ equals the sum of the squares of the signal coefficients $s_j$, and $\sigma^2 = \tfrac{1}{2}kT_0$. This is what we set out to prove.

One might object that if an average energy $\tfrac{1}{2}kT_0$ is associated with each variable $v_j$ under hypothesis $H_0$, and if there are an infinite number of these, the total average energy in the field of the transmission line must be infinite. The resolution of this paradox rests on Max Planck's quantum hypothesis. First of all, we must assert that the modal variables actually occur in pairs, which we denote by $v_j'$ and $v_j''$, and each pair is associated with a frequency $\nu_j$ that is an integral multiple of $c/L$, where $c$ is the velocity of propagation of waves along the line and $L$ is its length. By virtue of Maxwell's equations, the $j$th such mode behaves like a harmonic oscillator of frequency $\nu_j$. We can think of $v_j'$ and $v_j''$ as the real and imaginary parts of the complex amplitude

$$(v_j' + i v_j'')\exp 2\pi i \nu_j t$$

of that oscillator or alternatively as related to the electric and magnetic components, respectively, of the field in the line.

The total energy in the $j$th modal oscillator is now, by our normalization,

$$W_j = v_j'^2 + v_j''^2,$$

and because $v_j'$ and $v_j''$ are Gaussian and independent, the probability density function of this energy under hypothesis $H_0$ is

$$p_0(W_j) = (kT_0)^{-1}\exp\left(-\frac{W_j}{kT_0}\right),$$

which is known as the Boltzmann distribution.

Planck postulated that each field oscillator cannot take up an arbitrary energy $W_j$, but only a discrete amount that is an integral multiple of $h\nu_j$; $h = 6.626 \cdot 10^{-34}$ joule-sec became known as Planck's constant. That energy is still, however, governed by a Boltzmann distribution, whereby for any nonnegative integer $n$ the probability that the oscillator holds $n$ such "quanta" is

$$\Pr(W_j = n h\nu_j \mid H_0) = C\exp\left(-\frac{n h\nu_j}{kT_0}\right), \qquad n = 0, 1, 2, \ldots.$$

A simple calculation shows that because these probabilities must sum to 1, the constant $C$ must be

$$C = 1 - w_j, \qquad w_j = \exp\left(-\frac{h\nu_j}{kT_0}\right).$$

The average number of quanta is then

$$E(n \mid H_0) = (1 - w_j)\sum_{n=0}^{\infty} n\,w_j^n = \frac{w_j}{1 - w_j} = \left[\exp\left(\frac{h\nu_j}{kT_0}\right) - 1\right]^{-1}.$$

The average total energy in the $j$th modal oscillator is therefore not $kT_0$, but

$$E(W_j \mid H_0) = \frac{h\nu_j}{\exp(h\nu_j/kT_0) - 1}.$$

This approximately equals $kT_0$ when $\nu_j \ll kT_0/h$, but decreases exponentially with $\nu_j$ when $\nu_j \gg kT_0/h$. (For $T_0 = 17^\circ$C $= 290$ K, $kT_0/h = 6 \cdot 10^{12}$ Hz, a frequency far higher than those encountered in radio communications or in radar.) The total average thermal energy in the field of the transmission line, which is the sum over $j$ of $E(W_j \mid H_0)$, is therefore finite.
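
The size of the quantum correction is easy to tabulate. The following Python sketch evaluates the average modal energy $h\nu/[\exp(h\nu/kT_0) - 1]$ and compares it with the classical value $kT_0$; the function names and the two sample frequencies are illustrative assumptions, and the constants are the exact SI values rather than the rounded ones quoted in the text.

```python
import math

K_BOLTZMANN = 1.380649e-23   # joule per kelvin
H_PLANCK = 6.62607015e-34    # joule-second

def mean_mode_energy(freq_hz, temp_k):
    """Average energy h*nu / (exp(h*nu/kT) - 1) held by one modal oscillator."""
    x = H_PLANCK * freq_hz / (K_BOLTZMANN * temp_k)
    return H_PLANCK * freq_hz / math.expm1(x)

def mean_quanta(freq_hz, temp_k):
    """Average number of quanta, w/(1 - w) with w = exp(-h*nu/kT)."""
    w = math.exp(-H_PLANCK * freq_hz / (K_BOLTZMANN * temp_k))
    return w / (1.0 - w)

T0 = 290.0
kT0 = K_BOLTZMANN * T0
crossover_hz = kT0 / H_PLANCK                       # about 6e12 Hz
radar_ratio = mean_mode_energy(10e9, T0) / kT0      # 10 GHz: very close to 1
optical_ratio = mean_mode_energy(5e14, T0) / kT0    # visible light: negligible
```

At a hypothetical radar frequency of 10 GHz the ratio differs from unity by less than one part in a thousand, which is why the classical value $kT_0$ serves throughout the radio and radar bands.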

It would be incorrect, however, to replace $kT_0$ in (2-117) by the value of $E(W_j \mid H_0)$ just quoted. If the frequencies in the signal $s(t)$ are so high that the average thermal energy in each mode it occupies must be so expressed, ordinary statistical decision theory is inapplicable, and the decision about the presence of the signal must be treated in a manner consistent with the laws of quantum mechanics. How to do this has been described elsewhere [Hel76], [Hol82] and lies beyond the scope of this book. The effects of the quantum nature of signals at frequencies larger than $kT_0/h$ will show up, however, when we treat the detection of optical signals in Chapter 12.

2.4.2 The Circuit Model

An alternative derivation of (2-117), based on an analysis of a circuit model of the receiver, can be instructive. The signal is created in the receiver when the pulsed electromagnetic wave from the transmitter excites the antenna and causes a current to flow through an input load impedance. The voltage across this impedance $Z_L$, as shown in Fig. 2-7, is the $v(t)$ that we have taken as the input to our receiver in the previous sections. The signal component of this voltage, when present, is $s(t)$. In this model the noise component $n(t)$ arises from the random external electromagnetic fields exciting the antenna and from thermal agitation of the constituents of the load impedance $Z_L$. We should like to interpret the maximum attainable signal-to-noise ratio $d^2$ in (2-80) in terms of the total energy $\varepsilon$ delivered by the transmitter to the receiver and the effective absolute temperature $T_0$ of the perturbing thermal noise.

The antenna is replaced by its Thévenin equivalent circuit, consisting as shown in Fig. 2-7 of an ideal voltage generator in series with the impedance $Z_A$ of the antenna. Its open-circuit voltage, induced by the incident signal field, equals $s_0(t)$. The voltage generator produces a pulsed voltage that induces a current in the circuit composed of $Z_L$ and $Z_A$ in series, and the voltage thus created across the load impedance $Z_L$ is the signal $s(t)$. Denote the Fourier transforms of $s_0(t)$ and $s(t)$ by $S_0(\omega)$ and $S(\omega)$, respectively. Then

$$S(\omega) = \frac{Z_L}{Z_L + Z_A}\,S_0(\omega). \tag{2-118}$$

When the region of space toward which the beam of the antenna is pointed has an effective absolute temperature $T_0$, the antenna appears, according to Nyquist's law, to have in series with its equivalent impedance $Z_A$ an ideal generator of a random voltage $n_A(t)$ with spectral density

$$\Phi_0(\omega) = 2kT_0\,\operatorname{Re} Z_A = 2kT_0 R_A.$$

Here $k$ is again Boltzmann's constant and $R_A$ is the resistive component of the impedance $Z_A$, which is largely the radiation resistance of the antenna [Hel91, pp. 418-31], [Pap91, pp. 351-4].

Figure 2-7. Equivalent circuit for the input to a receiver.

The load, at absolute temperature $T_L$, likewise seems to have in series with it an ideal generator of a random voltage $n_L(t)$ with spectral density

$$\Phi_L(\omega) = 2kT_L\,\operatorname{Re} Z_L = 2kT_L R_L.$$

The net spectral density of the noise voltage $n(t)$ observed between the terminals of the load is, by (2-9),

$$\Phi(\omega) = \Phi_0(\omega)\,|y_A(\omega)|^2 + \Phi_L(\omega)\,|y_L(\omega)|^2,$$

where as in (2-118)

$$y_A(\omega) = \frac{Z_L}{Z_L + Z_A}$$

is the transfer function to the output from an ideal generator in series with $Z_A$, and where

$$y_L(\omega) = \frac{Z_A}{Z_L + Z_A}$$

is the transfer function to the output from an ideal generator in series with $Z_L$. Hence the spectral density of the noise in the output is

$$\Phi(\omega) = \frac{2kT_0 R_A |Z_L|^2 + 2kT_L R_L |Z_A|^2}{|Z_L + Z_A|^2}.$$

The maximum signal-to-noise ratio at the output with respect to detection of the signal $s(t)$ is then, by (2-80),

$$d^2 = \int_{-\infty}^{\infty} \frac{|S(\omega)|^2}{\Phi(\omega)}\,\frac{d\omega}{2\pi} = \int_{-\infty}^{\infty} \frac{|S_0(\omega)|^2\,|Z_L|^2}{2kT_0 R_A |Z_L|^2 + 2kT_L R_L |Z_A|^2}\,\frac{d\omega}{2\pi}.$$

Diminishing the temperature $T_L$ of the load causes $d^2$ to increase, and we see that it is bounded above by

$$d^2 \le \int_{-\infty}^{\infty} \frac{|S_0(\omega)|^2}{2kT_0 R_A}\,\frac{d\omega}{2\pi}. \tag{2-119}$$

The smaller the term $kT_L R_L |Z_A|^2$ is in comparison with the term $kT_0 R_A |Z_L|^2$, the more closely equality is approached. We now endeavor to express this maximum possible output signal-to-noise ratio in terms of the signal field striking the antenna.

The integral in (2-119) must be proportional to the energy density in the electromagnetic field of the signal as it passes the antenna, and that energy density is conveniently measured by the energy $\varepsilon$ that would be absorbed from the signal field if the load were matched to the antenna [Hel91, pp. 429-31]. To match the antenna, the load is given an impedance $Z_L = Z_A^* = R_A - iX_A$, and the absorbed energy is what is then dissipated in the load:

$$\varepsilon = \int_{-\infty}^{\infty} \frac{|S(\omega)|^2}{Z_L}\,\frac{d\omega}{2\pi} = \int_{-\infty}^{\infty} \frac{Z_L^*\,|S_0(\omega)|^2}{|Z_L + Z_A|^2}\,\frac{d\omega}{2\pi} = \int_{-\infty}^{\infty} \frac{|S_0(\omega)|^2}{4R_A}\,\frac{d\omega}{2\pi}.$$

Under the reasonable assumption that the effective temperature $T_0$ is uniform over the band of frequencies occupied by the signal, we can therefore write (2-119) as

$$d^2 \le d_{\mathrm{in}}^2 = \frac{2\varepsilon}{kT_0}, \tag{2-120}$$

where $\varepsilon$ is the maximum energy that the receiver can draw from the incident signal field. This energy must have been supplied by the transmitter.
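
To see how closely the bound (2-120) is approached, one can take the special case in which $Z_A$ and $Z_L$ do not vary with frequency over the signal band. Under that assumption, which is introduced here and not made in the text, the ratio of $d^2$ to $2\varepsilon/kT_0$ reduces to $T_0 R_A |Z_L|^2 / (T_0 R_A |Z_L|^2 + T_L R_L |Z_A|^2)$. The Python sketch below evaluates it; the function name and impedance values are hypothetical.

```python
def snr_fraction(z_a, z_l, t0, t_l):
    """Ratio d^2 / (2*eps/(k*T0)) for impedances that do not vary with frequency."""
    r_a, r_l = z_a.real, z_l.real
    external = t0 * r_a * abs(z_l) ** 2      # noise picked up by the antenna
    internal = t_l * r_l * abs(z_a) ** 2     # thermal noise of the load itself
    return external / (external + internal)

Z_A = complex(50.0, 20.0)                    # hypothetical antenna impedance, ohms
matched_warm = snr_fraction(Z_A, Z_A.conjugate(), 290.0, 290.0)
matched_cold = snr_fraction(Z_A, Z_A.conjugate(), 290.0, 0.0)
mismatched_high = snr_fraction(Z_A, complex(5000.0, 0.0), 290.0, 290.0)
```

A matched load at the same temperature as the surroundings yields only half the bound, a load cooled toward absolute zero attains it, and a load of much higher impedance comes close to it even when warm, because its own noise term $kT_L R_L |Z_A|^2$ is then comparatively small.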

The energy $\varepsilon$ absorbed by a matched load is equal to $E_1 A_r$, where $E_1$ is the total energy in the electromagnetic field of the signal passing through a unit area normal to the direction of propagation, and $A_r$ is the effective area of the antenna for waves moving in that direction. In terms of the gain $\Gamma$ of the antenna in that same direction, $A_r = \Gamma\lambda^2/4\pi$, where $\lambda$ is the wavelength of the signal radiation. If the dimensions of a plane antenna are much greater than a wavelength, as shown by antenna theory [Sil49], the maximum value of the effective area $A_r$ is equal to the geometrical area of the antenna. The energy $\varepsilon$ is then the total energy in the signal field that passes through an area equal to that of the antenna and perpendicular to the direction of propagation. In this sense we can call $\varepsilon$ the signal energy intercepted by the antenna, and (2-120) informs us that at best the output signal-to-noise ratio $d^2$ equals the intercepted signal energy divided by $\tfrac{1}{2}kT_0$. The quantity $kT_0$ is the maximum noise power per unit of frequency available from the surroundings when they are at absolute temperature $T_0$ [Hel91, p. 430]. Thus in the example following (2-72), the minimum detectable signal must furnish a total energy $\varepsilon$ of at least $25.1\,kT_0$ during the observation interval. (For $T_0 = 17^\circ$C $= 290$ K, $kT_0 = 4 \cdot 10^{-21}$ joules.)
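
The quantities in this paragraph are simple to evaluate. The Python sketch below computes the effective area from a given gain and wavelength, the intercepted energy $E_1 A_r$, and the bound $2\varepsilon/kT_0$; the function names are assumptions introduced for illustration, and the figure of 25.1 is taken from the text above.

```python
import math

K_BOLTZMANN = 1.380649e-23  # joule per kelvin

def effective_area(gain, wavelength_m):
    """A_r = gain * wavelength^2 / (4*pi), in square meters."""
    return gain * wavelength_m ** 2 / (4.0 * math.pi)

def intercepted_energy(energy_per_area, gain, wavelength_m):
    """Energy absorbed by a matched load: E_1 * A_r, in joules."""
    return energy_per_area * effective_area(gain, wavelength_m)

def snr_upper_bound(energy_j, temp_k):
    """Upper bound 2*eps/(k*T0) on the output signal-to-noise ratio d^2."""
    return 2.0 * energy_j / (K_BOLTZMANN * temp_k)

kT0 = K_BOLTZMANN * 290.0          # about 4e-21 joule
min_energy = 25.1 * kT0            # minimum energy quoted in the text
d2_at_min = snr_upper_bound(min_energy, 290.0)
```

With these numbers the minimum detectable energy is about $10^{-19}$ joule, corresponding to $d^2 = 50.2$.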

In a practical receiver the input from the antenna must be amplified before it can be processed to form the detection statistic $g$ for comparison with the decision level $g_0$. The signal and the noise are amplified together, and the amplifiers themselves add noise generated through the shot effect and other random electronic mechanisms.

How an amplifier affects the signal-to-noise ratio is often measured by its noise figure $F$, $F > 1$. If $G$ is the power gain of the amplifier, the noise power per unit bandwidth at the output of an ideal amplifier would be $G$ times that at the input. Because the amplifier itself adds noise, the actual noise power density at its output is not $G$, but $FG$ times that at the input. The effective signal-to-noise ratio at the output is therefore

$$d_{\mathrm{out}}^2 = \frac{d_{\mathrm{in}}^2}{F}$$

in terms of the effective input signal-to-noise ratio $d_{\mathrm{in}}^2 = 2\varepsilon/kT_0$.

It is shown in texts on radio and radar that for two networks in cascade having noise figures $F_1$ and $F_2$ and power gains $G_1$ and $G_2$, in that order, the noise figure of the combination is given by

$$F = F_1 + \frac{F_2 - 1}{G_1}$$

[Hel91, p. 432]. Hence if the first amplifier has such a large gain that $G_1 \gg F_2 - 1$, its noise figure effectively determines that of the combination. Extension to a chain of many amplifiers is immediate. For further discussion of these matters, see [Dav58, Ch. 10] and [Hau58].
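
The extension to a chain of amplifiers can be written as a short loop, applying the two-stage formula repeatedly. The Python sketch below is one way to do it; the function name and the sample noise figures and gains (all as power ratios, not decibels) are hypothetical.

```python
def cascade_noise_figure(stages):
    """Noise figure of a chain; `stages` is a list of (noise_figure, power_gain) pairs."""
    total_f = 1.0
    gain_so_far = 1.0
    for noise_figure, power_gain in stages:
        # Each stage's excess noise (F - 1) is referred back through the gain ahead of it.
        total_f += (noise_figure - 1.0) / gain_so_far
        gain_so_far *= power_gain
    return total_f

two_stage = cascade_noise_figure([(2.0, 100.0), (10.0, 20.0)])
three_stage = cascade_noise_figure([(2.0, 100.0), (10.0, 20.0), (5.0, 10.0)])
```

In this example the first stage, with gain 100, holds the overall noise figure near its own value of 2 even though the second stage is much noisier.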

The signal $s(t)$ whose detection has been the subject of this chapter has been assumed to be completely known in amplitude, form, and time of arrival. In radar practice, as often in communications, signal parameters may be known only within wide ranges, with a consequent decrease in the probability of detection or increase in the energy of the minimum detectable signal. Strategies for detecting more vaguely known signals will be discussed in later chapters.



2.5 DECISIONS AMONG A NUMBER OF KNOWN SIGNALS 
2.5.1 Signal Space 

A communication system might be transmitting information coded in $M$ symbols, as described in Sec. 1.1. The $j$th symbol causes the transmitter to emit a signal that appears at the input to the receiver as $s_j(t)$. The receiver must decide which of the $M$ signals $s_i(t)$ is present in its input,

$$v(t) = s_i(t) + n(t), \qquad 0 \le t \le T, \quad (H_i), \quad i = 1, 2, \ldots, M, \tag{2-121}$$

during a particular observation interval; that is, it must choose one of the $M$ hypotheses $H_i$. The form of each of the signals is completely known. We assume that each signal is confined to $(0, T)$ and that intersymbol interference is absent. The $i$th signal is transmitted with relative frequency $\zeta_i$, which is therefore the prior probability of hypothesis $H_i$, and

$$\sum_{i=1}^{M} \zeta_i = 1.$$

The noise $n(t)$ will be taken as white and Gaussian with unilateral spectral density $N$. The receiver is to incur minimum probability of error in its decisions.

As in Sec. 2.1.3, we sample the input $v(t)$ by means of a set of functions $f_i(t)$, $i = 1, 2, \ldots$, that are orthonormal over the observation interval $(0, T)$ in the sense of (2-11). Remembering the irrelevance argument in Sec. 2.2.4, we create that set by applying the Gram-Schmidt procedure of Sec. 2.1.4 to a set of functions $g_1(t), g_2(t), \ldots$, that begins with the signals $s_i(t)$ to be detected. When we have utilized all the signals (there are only $M$), we must bring in other functions $g_j(t)$ to complete the data space. If the $M$ signals $s_i(t)$ are linearly independent, we can take $g_i(t) = s_i(t)$, $i = 1, 2, \ldots, M$, and by the orthogonalization procedure we can form a set of $M$ orthonormal functions $f_i(t)$ that are linear combinations of them. The signals might, for instance, be orthogonal in the first place,

$$\int_0^T s_i(t)\,s_j(t)\,dt = E_i\,\delta_{ij}, \tag{2-122}$$

where $E_i$ is the energy of the $i$th signal. Then

$$f_i(t) = E_i^{-1/2} s_i(t), \qquad 1 \le i \le M.$$

If the signals $s_i(t)$ are not linearly independent, however, we shall be able to use only some subset of $D < M$ of them before we run out of usable functions $g_j(t)$ and must seek elsewhere. The signals might, for instance, differ only in amplitude, as in a pulse-amplitude-modulated (PAM) communication system,

$$s_k(t) = A_k s(t), \qquad k = 1, 2, \ldots, M, \tag{2-123}$$

whereupon $D = 1$. As another example, consider a quaternary communication system in which the four received signals are

$$s_1(t) = A f_1(t), \quad s_2(t) = -A f_1(t), \quad s_3(t) = A f_2(t), \quad s_4(t) = -A f_2(t), \tag{2-124}$$

where the functions $f_1(t)$ and $f_2(t)$ are orthonormal as in (2-11). Only two of these four signals are linearly independent, and $D = 2$.

Each signal can be written as a linear combination of the $D$ orthonormal functions $f_i(t)$ formed by the Gram-Schmidt procedure,

$$s_k(t) = \sum_{i=1}^{D} s_{ki} f_i(t), \qquad s_{ki} = \int_0^T s_k(t)\,f_i(t)\,dt. \tag{2-125}$$

The $D$ components $s_{ki}$ form a vector

$$\mathbf{s}_k = (s_{k1}, s_{k2}, \ldots, s_{kD})$$

that represents the signal. For amplitude-modulated signals as in (2-123), these vectors have a single component proportional to $A_k$. For orthogonal signals of equal energy $E$, each vector has $M$ components, of which one equals $E^{1/2}$ and the rest equal 0. The $M$ signals $s_i(t)$ are said to span a $D$-dimensional space, as do the $D$ orthonormal functions $f_i(t)$ formed by combining them. This space is called the signal space.

Figure 2-8. Signal vectors, PAM communication system, $M = 5$.

Figure 2-9. Signal vectors, quaternary communication system.

Figure 2-10. Orthogonal signal vectors, $M = 3$, $E_j \equiv E$.

Figure 2-8 shows the signal vectors $\mathbf{s}_k$ associated with the PAM signals in
(2-123) when $M = 5$ and the signals have amplitudes $A_k = (k - 3)A$, $1 \le k \le 5$;
$\mathbf{s}_3 = \mathbf{0}$ is a null vector. Figure 2-9 shows the two-dimensional signal space that embeds the
four quaternary signals in (2-124). Figure 2-10 shows the three-dimensional signal
space spanned by three orthogonal signals of equal energy.
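
The construction of the signal space can be tried out numerically. The following is a minimal sketch, not part of the original text: it assumes the signals are sampled at uniform spacing `dt`, approximates integrals over $(0, T)$ by sums, and uses hypothetical sinusoids for $f_1(t)$ and $f_2(t)$ in the quaternary example of (2-124). The function name `signal_space` and the tolerance `tol` are illustrative choices.

```python
import numpy as np

def signal_space(signals, dt, tol=1e-9):
    """Gram-Schmidt on sampled signals; returns (basis, vectors).

    signals: array of shape (M, n_samples), row k holds s_k(t) sampled every dt.
    basis:   array of shape (D, n_samples), orthonormal under sum(f * g) * dt.
    vectors: array of shape (M, D), components s_ki = integral of s_k f_i dt.
    """
    basis = []
    for s in signals:
        r = s.astype(float).copy()
        for f in basis:
            r -= np.sum(r * f) * dt * f       # remove the projection on f
        norm = np.sqrt(np.sum(r * r) * dt)
        if norm > tol:                         # keep only independent directions
            basis.append(r / norm)
    basis = np.array(basis)
    vectors = signals @ basis.T * dt
    return basis, vectors

# Quaternary example of (2-124); f1 and f2 are assumed sinusoids, A is arbitrary.
T, n = 1.0, 2000
dt = T / n
t = (np.arange(n) + 0.5) * dt
f1 = np.sqrt(2 / T) * np.cos(2 * np.pi * t / T)
f2 = np.sqrt(2 / T) * np.sin(2 * np.pi * t / T)
A = 3.0
signals = np.array([A * f1, -A * f1, A * f2, -A * f2])

basis, vectors = signal_space(signals, dt)
print("D =", basis.shape[0])
print(np.round(vectors, 6))
```

The number of rows of `basis` is the dimension $D$, and each row of `vectors` is the signal vector $\mathbf{s}_k$ of (2-125) in that basis.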

The remaining members $f_k(t)$, $k > D$, of the set of sampling functions will be
orthogonal to the first $D$ functions $f_i(t)$, $1 \le i \le D$, and hence to all $M$ signals,
which are linear combinations of the functions $f_j(t)$ for $1 \le j \le D$.

As in Sec. 2.2.1, the receiver is to base its choice among the $M$ hypotheses $H_j$
on the samples

\[ x_i = \int_0^T f_i(t) v(t)\,dt. \tag{2-126} \]

These data are uncorrelated because, as in Sec. 2.1.5, the noise is white and the
sampling functions $f_i(t)$ are orthonormal. As the data are Gaussian random variables,
they are therefore statistically independent. They have expected values

\[ E(x_i \mid H_j) = \int_0^T f_i(t) s_j(t)\,dt = s_{ji}, \quad 1 \le i \le D; \qquad E(x_i \mid H_j) = 0, \quad i > D, \]

when the $j$th signal is present, and their variances are equal,

\[ \operatorname{Var}(x_i \mid H_j) = \frac{N}{2}, \]

as can be verified by the techniques of Sec. 2.2. The joint probability density function
of any number $n > D$ of the samples has the multivariate Gaussian form

\[ p_j^{(n)}(\mathbf{x}) = (\pi N)^{-n/2} \exp\left[-\frac{1}{N} \sum_{i=1}^{D} (x_i - s_{ji})^2\right] \exp\left[-\frac{1}{N} \sum_{i=D+1}^{n} x_i^2\right] \tag{2-127} \]

under the hypothesis that the $j$th signal is present.

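According to what we learned in Sec. 1.1, the receiver will attain minimum
probability of error if it chooses the hypothesis with the greatest posterior probability
$\Pr(H_j \mid \mathbf{x})$, given the set $\mathbf{x} = (x_1, x_2, \ldots, x_D, \ldots, x_n)$. From (1-4) we see that all it
needs to do is to compare the $M$ quantities $\zeta_j p_j^{(n)}(\mathbf{x})$, where $\zeta_j$ is the prior
probability of hypothesis $H_j$, and choose the largest. The
rightmost exponential factor in (2-127) is common to all these quantities and cancels
out from the comparisons: the data $x_i$ for $i > D$ are irrelevant to the decision
because they contain no information about which hypothesis is true. The $D$ random
variables $x_i$, $1 \le i \le D$, which we gather into a data vector

\[ \mathbf{x} = (x_1, x_2, \ldots, x_D), \qquad D \le M, \]

contain all the information in the input $v(t)$ relevant to deciding among the $M$
hypotheses in the optimum fashion. The receiver bases its decision on the $M$ quantities
$\zeta_j p_j(\mathbf{x})$, where

\[ p_j(\mathbf{x}) = (\pi N)^{-D/2} \exp\left[-\frac{1}{N} |\mathbf{x} - \mathbf{s}_j|^2\right], \qquad j = 1, 2, \ldots, M. \tag{2-128} \]

Here

\[ |\mathbf{x} - \mathbf{s}_j| = (\mathbf{x} - \mathbf{s}_j, \mathbf{x} - \mathbf{s}_j)^{1/2} = \left[\sum_{i=1}^{D} (x_i - s_{ji})^2\right]^{1/2} \]

is the distance in the signal space from the "data point" $\mathbf{x} = (x_1, x_2, \ldots, x_D)$ to the
vertex of the $j$th signal vector $\mathbf{s}_j$. The receiver picks that hypothesis for which $\zeta_j p_j(\mathbf{x})$
is largest. Equivalently, as follows on taking the logarithm of $\zeta_j p_j(\mathbf{x})$ and discarding
the term in $|\mathbf{x}|^2$ common to all hypotheses, the receiver chooses that hypothesis $H_j$ for which

\[ (\mathbf{s}_j, \mathbf{x}) - \tfrac{1}{2} |\mathbf{s}_j|^2 + \tfrac{1}{2} N \ln \zeta_j > (\mathbf{s}_k, \mathbf{x}) - \tfrac{1}{2} |\mathbf{s}_k|^2 + \tfrac{1}{2} N \ln \zeta_k, \qquad \forall k \ne j. \tag{2-129} \]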

We use the same notation as in Sec. 2.1.4;

\[ (\mathbf{s}_k, \mathbf{x}) = \sum_{i=1}^{D} s_{ki} x_i = \int_0^T s_k(t) v(t)\,dt \]

is the scalar product of the $k$th signal vector $\mathbf{s}_k$ and the data vector $\mathbf{x}$.

The $D$ data $x_1, x_2, \ldots, x_D$ on which the decision is based can be generated by
passing the input $v(t)$ through a set of $D$ parallel filters matched as in (2-17) to the
$D$ orthogonal "signals" $f_k(t)$, $1 \le k \le D$, and measuring the output of each filter
at the end $t = T$ of the observation interval. For amplitude-modulated signals as in
(2-123), we have a single datum $x$, the output of a filter matched to the signal $s(t)$
to which all $M$ of the signals are proportional. When the signals are orthogonal,
there are $M$ independent data $x_1, x_2, \ldots, x_M$.

Let us designate by $R_j$ the region in the $D$-dimensional signal space containing
those data vectors $\mathbf{x}$ that lead to the choice of hypothesis $H_j$, "Signal $s_j(t)$ is present."
From (2-129) we see that these decision regions are bounded by hyperplanes on which
the scalar product $(\mathbf{s}_i - \mathbf{s}_j, \mathbf{x})$ is constant, and each hyperplane is perpendicular to
the vector $\mathbf{s}_j - \mathbf{s}_i$ connecting the vertices of some pair $\mathbf{s}_i$, $\mathbf{s}_j$ of signal vectors.

The probability $q_{jj}$ of correctly choosing hypothesis $H_j$ is the integral over
$R_j$ of the probability density function $p_j(\mathbf{x})$ of the data $x_1, x_2, \ldots, x_D$. This
density function has the multivariate Gaussian form in (2-128), and that probability is
therefore

\[ q_{jj} = (\pi N)^{-D/2} \int_{R_j} \exp\left[-\frac{1}{N} |\mathbf{x} - \mathbf{s}_j|^2\right] d^D x, \tag{2-130} \]

where $d^D x = dx_1 \cdots dx_D$ is the volume element in the $D$-dimensional space. The
probability of error is then

\[ P_e = 1 - \sum_{j=1}^{M} \zeta_j q_{jj}. \tag{2-131} \]

For an arbitrary configuration of signal vectors $\mathbf{s}_j$, evaluating the integral in (2-130)
may be quite difficult.

In many communication systems information is transmitted at maximum rate
by coding it into $M$ symbols in such a way that they occur equally often, $\zeta_j \equiv M^{-1}$.
Then the receiver chooses the signal $s_k(t)$ whose vector $\mathbf{s}_k$ lies closest to the data
vector $\mathbf{x}$ in the sense that the distance $|\mathbf{x} - \mathbf{s}_k|$ is smallest among all the distances
$|\mathbf{x} - \mathbf{s}_i|$, $1 \le i \le M$.
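
The decision rule just described is easy to state in code. The sketch below is illustrative only: the function name `decide` and the numerical values are assumptions, and the metric is the logarithm of $\zeta_j p_j(\mathbf{x})$ from (2-128) with the common factor $(\pi N)^{-D/2}$ dropped. With equal prior probabilities it reduces to picking the nearest signal vector.

```python
import numpy as np

def decide(x, S, priors, N):
    """Return the index j that maximizes ln(zeta_j) - |x - s_j|^2 / N."""
    x = np.asarray(x, dtype=float)
    S = np.asarray(S, dtype=float)
    metric = np.log(priors) - np.sum((x - S) ** 2, axis=1) / N
    return int(np.argmax(metric))

# PAM vectors of Figure 2-8: A_k = (k - 3) A in a one-dimensional signal space.
A = 1.0
S = np.array([[(k - 3) * A] for k in range(1, 6)])
equal = np.full(5, 0.2)

nearest_a = decide([0.4], S, equal, N=0.5)   # datum closest to s_3 = 0
nearest_b = decide([0.6], S, equal, N=0.5)   # datum closest to s_4 = A

# A strongly favored hypothesis can win even when its vector is not the nearest.
skewed = np.array([0.01, 0.01, 0.96, 0.01, 0.01])
biased = decide([0.6], S, skewed, N=0.5)
print(nearest_a, nearest_b, biased)
```

Indices are zero-based, so index 2 corresponds to the signal $s_3(t)$.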

The probability of error is invariant to rigid rotation of the set of M signal 
vectors s k , for the surfaces bounding the decision regions Rj rotate with them, and 
the probability density functions move with them unchanged. This invariance is 
sometimes useful in calculating error probabilities. The probability of error depends 
only on the configuration of the M signal vectors s k in the signal space, and not 
on the particular orthonormal functions $f_i(t)$, $i = 1, 2, \ldots, D$, used to represent the
signals through (2-125). Thus a great variety of sets of received signals may incur 
the same probability of error. 

If the noise is not white, but colored, possessing an autocovariance function
$\phi(t, s)$, a signal space of the same kind can be constructed as we have shown here if
one uses instead the type of scalar product introduced in Sec. 2.1.7 and designated
by angular brackets $\langle \cdot , \cdot \rangle$.

2.5.2 Displacement of the Signal Vectors



Suppose that a new set of signals $s_i'(t)$ is generated by adding to each signal $s_i(t)$ a
common signal $a(t)$ that lies in the same signal space and is represented there by a
vector $\mathbf{a}$ as in (2-125). The new signals are represented by the vectors

\[ \mathbf{s}_i' = \mathbf{s}_i + \mathbf{a}, \qquad 1 \le i \le M. \tag{2-132} \]

The decision rule for the receiver of these new signals $s_j'(t)$ is the same as in
(2-129), except that $\mathbf{s}_j$ is replaced by $\mathbf{s}_j'$ and $\mathbf{s}_k$ by $\mathbf{s}_k'$. Substituting into it from (2-132),
we find that the hyperplanes separating the decision regions $R_j'$ for the new receiver
are obtained by displacing all the points in $R_j$ by the same vector $\mathbf{a}$. The decision
regions $R_j'$ for the new set of signals are simply the old regions $R_j$ moved rigidly
along with the signal points $\mathbf{s}_j$ in the displacement (2-132).

Although the energies of the new signals $s_j'(t)$ differ from those of the original
ones, the minimum probability $P_e'$ of error in deciding among them is equal to the
minimum probability $P_e$ of error in deciding among the original signals. Indeed,
the probability $q_{jj}'$ that the new receiver correctly chooses signal $s_j'(t)$ (hypothesis $H_j'$)
when $H_j'$ is true is, as in (2-130),

\[ q_{jj}' = (\pi N)^{-D/2} \int_{R_j'} \exp\left[-\frac{1}{N} |\mathbf{x}' - \mathbf{s}_j'|^2\right] d^D x'. \]

Changing variables to $\mathbf{x} = \mathbf{x}' - \mathbf{a}$, we find that this becomes identical with the
expression in (2-130) for the probability that the original receiver correctly chooses
hypothesis $H_j$ ("Signal $s_j(t)$ is present") when $H_j$ is true. The probability of error in
the new receiver must therefore be the same as the probability $P_e$ of error (2-131) in
the receiver of the original signals $s_i(t)$. Because of this invariance of the probability
of error, when a new set of signals is formed by displacement from a set for which
the probability of error is already known, the only problem remaining is to calculate
the energies of the new signals.

2.5.3 Orthogonal and Simplex Signals

The simplest configuration of signals is one in which they are orthogonal, convey
equal energies $E$ to the receiver,

\[ (\mathbf{s}_i, \mathbf{s}_j) = E \delta_{ij}, \]

and are transmitted equally often. The signal vector $\mathbf{s}_j$ whose vertex is closest to the
data point $\mathbf{x}$ will be that for which the $j$th component $x_j$ is largest:

\[ (x_j > x_i, \ \forall i \ne j) \Rightarrow H_j. \]

All probabilities $q_{ii}$ of choosing hypothesis $H_i$ when $H_i$ is true are now equal,
and in calculating the average probability $Q_c$ of correct decision, we can assume that
the first signal $s_1(t)$ was actually the one received and that hypothesis $H_1$ is true.
Hypothesis $H_1$ is correctly selected if $x_1 > x_j$, $\forall j \ne 1$; and because $s_{ji} = E^{1/2} \delta_{ij}$,
the probability of correct decision is

\[ Q_c = q_{11} = \Pr(x_j < x_1, \ j = 2, 3, \ldots, M \mid H_1) = \int_{-\infty}^{\infty} p_1(x_1)\,dx_1 \left[\int_{-\infty}^{x_1} p_0(x_j)\,dx_j\right]^{M-1}, \]

where

\[ p_0(x) = (\pi N)^{-1/2} \exp\left(-\frac{x^2}{N}\right), \qquad p_1(x) = (\pi N)^{-1/2} \exp\left[-\frac{(x - E^{1/2})^2}{N}\right]. \]

Hence

\[ Q_c = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left[-\tfrac{1}{2}(x - d)^2\right] (1 - \operatorname{erfc} x)^{M-1}\,dx \tag{2-133} \]

in terms of the error-function integral (1-11). Here $d^2 = 2E/N$ is the signal-to-noise
ratio for each signal. The error probabilities $P_e = 1 - Q_c$ have been tabulated and
plotted by Viterbi [Vit64] for values of $M$ that are integral powers of 2.
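
The integral in (2-133) is readily evaluated numerically. The following is a minimal sketch under stated assumptions: it uses the trapezoidal rule on a grid of half-width `half_width` about $x = d$ (both the function name `q_correct` and the grid parameters are illustrative), and it takes $1 - \operatorname{erfc} x$ to be the standard normal cumulative distribution, which is how the error-function integral of (1-11) enters here.

```python
import math
import numpy as np

def q_correct(M, d, half_width=10.0, n=20001):
    """Trapezoidal evaluation of (2-133) for M orthogonal equal-energy signals."""
    x = np.linspace(d - half_width, d + half_width, n)
    gauss = np.exp(-0.5 * (x - d) ** 2) / math.sqrt(2 * math.pi)
    # 1 - erfc(x) in the notation of (1-11): the standard normal CDF.
    cdf = 0.5 * (1.0 + np.array([math.erf(v / math.sqrt(2)) for v in x]))
    integrand = gauss * cdf ** (M - 1)
    dx = x[1] - x[0]
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dx)

# Error probabilities for a few alphabet sizes at d = 4 (d^2 = 2E/N = 16).
for M in (2, 4, 8, 16):
    print(M, 1.0 - q_correct(M, 4.0))
```

Two checks are available in closed form: for $d = 0$ the data carry no information and $Q_c = 1/M$, and for $M = 2$ the result must equal $1 - \operatorname{erfc}(d/\sqrt{2})$.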

When $M$ vectors $\mathbf{s}_k'$ all lie in a hyperspace of dimension $M - 1$, all of them
having equal lengths $|\mathbf{s}_k'| = E'^{1/2}$ and making equal angles with each other, they
are said to form a regular simplex. For $M = 2$ the corresponding signals $s_k'(t)$ are
antipodal. The tips of the vectors $\mathbf{s}_k'$ form for $M = 3$ an equilateral triangle and for
$M = 4$ a regular tetrahedron. For all $M$ the signals add to zero:

\[ \sum_{j=1}^{M} s_j'(t) \equiv 0, \qquad \sum_{j=1}^{M} \mathbf{s}_j' = \mathbf{0}. \]

We shall show that the probability of correct decision is again given by (2-133), but
with the signal-to-noise ratio $d^2$ replaced by the slightly larger value

\[ d^2 = \frac{2E'}{N} \cdot \frac{M}{M - 1}. \tag{2-134} \]

It has been conjectured that the regular simplex configuration attains the lowest error
probability $P_e$ among all sets of signals having a given total energy. For $M \gg 1$
this probability is only slightly less than that incurred by $M$ orthogonal signals.

In order to prove that (2-133) and (2-134) determine the probability of correct
decision for the simplex signals, we observe first that they are obtained from a set
of $M$ orthogonal signals $s_i(t)$ of equal energies $E$ by adding to each the signal

\[ a(t) = -\frac{1}{M} \sum_{j=1}^{M} s_j(t), \]

which is represented by the vector

\[ \mathbf{a} = -\frac{1}{M} \sum_{j=1}^{M} \mathbf{s}_j. \]

According to what we said in Sec. 2.5.2, when the receiver optimum for deciding
among them is used, these new signals $s_i'(t) = s_i(t) + a(t)$ result in the same
probability $P_e$ of error as the original set of $M$ orthogonal signals, for which $P_e$ is given
by (2-133). The sum of the new signals is zero, and the reader can easily show that
they all have equal energies and that their pairwise scalar products $(\mathbf{s}_i', \mathbf{s}_j')$, $i \ne j$, are
all equal. The new signals have smaller energies

\[ E' = (\mathbf{s}_i', \mathbf{s}_i') = (\mathbf{s}_i, \mathbf{s}_i) + 2(\mathbf{a}, \mathbf{s}_i) + (\mathbf{a}, \mathbf{a}) = E - 2M^{-1}E + EM^{-1} = E(1 - M^{-1}) \]

than the original orthogonal signals from which they were derived. The signal-to-noise
ratio for the original signals, which figures in (2-133), is therefore larger than
that for the simplex signals by a factor $(1 - M^{-1})^{-1} = M/(M - 1)$, whence (2-134).
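
The properties claimed for the simplex vectors can be confirmed with a few lines of linear algebra. In this sketch, which is an illustration and not part of the original derivation, the orthogonal vectors are taken along the coordinate axes of an $M$-dimensional space; the values of `M` and `E` are arbitrary.

```python
import numpy as np

M, E = 4, 2.0
S = np.sqrt(E) * np.eye(M)           # orthogonal vectors, (s_i, s_j) = E delta_ij
a = -S.mean(axis=0)                  # displacement a = -(1/M) sum_j s_j
S_simplex = S + a                    # s_i' = s_i + a, as in (2-132)

gram = S_simplex @ S_simplex.T       # all scalar products (s_i', s_j')
energies = np.diag(gram)
off_diagonal = gram[~np.eye(M, dtype=bool)]

print("sum of vectors:", np.round(S_simplex.sum(axis=0), 12))
print("energies:", np.round(energies, 12), "expected", E * (1 - 1 / M))
print("pairwise products:", np.round(off_diagonal, 12))
print("dimension spanned:", np.linalg.matrix_rank(S_simplex))
```

The common off-diagonal product works out to $-E/M$, and the rank of the matrix of simplex vectors is $M - 1$, the dimension of the hyperspace in which they lie.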

The problem of finding a set of signal vectors that yields low probability of 
error in a space of dimension D less than the number M of signals and the question 
of the optimality of the simplex signals are treated at length in the book by Weber 
[Web68]. The performance of systems involving other signal sets is analyzed in 
textbooks on communication theory, such as [Vil66] and [Pro89]. 

Problems 

In these problems the noise is Gaussian and the observation interval is (0, T) unless otherwise 
stated. 

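2-1. Let $\lambda_k$ be the $k$th eigenvalue of the integral equation

\[ \lambda f(t) = \int_0^T \phi(t, s) f(s)\,ds, \qquad 0 \le t \le T. \]

Prove that

\[ \sum_k \lambda_k = \int_0^T \phi(t, t)\,dt, \qquad \sum_k \lambda_k^2 = \int_0^T \!\! \int_0^T \phi(t, u) \phi(u, t)\,dt\,du. \]

Hint: Use (2-42).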

2-2. A stationary Gaussian random process $x(t)$ has expected value 0 and autocovariance
function $\phi(\tau)$. The variance of the process is estimated by forming the quantity

\[ Z = \frac{1}{T} \int_0^T [x(t)]^2\,dt. \]

Find the expected value of $Z$ and show that

\[ \operatorname{Var} Z = \frac{4}{T^2} \int_0^T (T - s) [\phi(s)]^2\,ds. \]

Hint: Here you will need the rule

\[ E(x_1 x_2 x_3 x_4) = E(x_1 x_2) E(x_3 x_4) + E(x_1 x_3) E(x_2 x_4) + E(x_1 x_4) E(x_2 x_3) \]

for Gaussian random variables with expected values zero [Hel91, p. 244], [Pap91, p. 197].

2-3. Taking as your interval $(-1, 1)$ instead of $(0, T)$, use the Gram-Schmidt procedure
in Sec. 2.1.4 to generate the first six Legendre polynomials $P_0(t), \ldots, P_5(t)$ from the
powers $t^0, \ldots, t^5$. [The conventional normalization of these polynomials is $P_k(1) = 1$.]

2-4. In [Kai71] Kailath defines the scalar product in the reproducing-kernel Hilbert space
in a manner different from that in Sec. 2.1.7. Taking a random process $n(t)$ with zero
expected value and autocovariance function $\phi(t, s)$, one assigns a random variable $u_h$
to a function $h(t)$ in the interval $(0, T)$ by the relation $E[n(t) u_h] = h(t)$. To a function
$m(t)$, the random variable $u_m$ is likewise assigned. The scalar product is then defined
as

\[ \langle h, m \rangle = E(u_h u_m). \]

Show that

\[ u_h = \int_0^T H(t) n(t)\,dt, \]

where $H(t)$ is the cofunction of $h(t)$ as in (2-44); $u_m$ is similarly specified. Then show
that Kailath's definition of the scalar product is equivalent to that in (2-43).

2-5. The optimum detector for a signal $s(t)$ in white Gaussian noise has been constructed,
but the signal that appears at its input is not $s(t)$, but $s_1(t)$. Calculate the probability
of detecting $s_1(t)$ and show how it depends on the integral

\[ \int_0^T s(t) s_1(t)\,dt. \]

2-6. The signal $s(t) = A[1 - \exp(-at)]$ is to be detected in the presence of white Gaussian
noise of unilateral spectral density $N$. Let the observation interval be $0 < t < T$. Find
the impulse response of the proper matched filter, and work out the output of the
matched filter as a function of time when the input is the signal $s(t)$. Assume $s(t) = 0$,
$t > T$.

2-7. A system is to be designed to decide which of two signals, $s_0(t)$ or $s_1(t)$, has been
received in the presence of white Gaussian noise of unilateral spectral density $N$. Show
that the system can base its decision on the correlation of the input with the difference
of the signals. Relate the decision level on the optimum statistic to the critical value
$\Lambda_0$ of the likelihood ratio.

2-8. In Problem 2-7, find the probabilities $Q_0$ and $Q_1$ of each of the two kinds of errors.
Show that for $Q_0$ fixed, the probability $Q_1$ depends on an effective signal-to-noise ratio

\[ d^2 = \frac{2(E_1 + E_0 - 2R)}{N}, \]

where $E_0$ and $E_1$ are the energies of the two signals, and $R$ is

\[ R = \int_0^T s_0(t) s_1(t)\,dt. \]

2-9. A signal

\[ s(t) = A \cos\left(\frac{2\pi t}{T}\right) + B \cos\left(\frac{4\pi t}{T}\right) \]

is to be detected in Gaussian noise $n(t)$ with autocovariance function

\[ \phi(t, s) = \alpha \cos(2\pi t/T) \cos(2\pi s/T) + \beta \cos(4\pi t/T) \cos(4\pi s/T), \qquad \alpha > 0, \ \beta > 0, \]

by observing the input $v(t)$ to the receiver during the interval $(0, T)$. That is, under
hypothesis $H_0$, $v(t) = n(t)$; under $H_1$, $v(t) = n(t) + s(t)$. Describe in detail the
optimum detector under the Neyman-Pearson criterion and calculate the probability $Q_d$ of
detection, showing how it depends on the false-alarm probability $Q_0$.

2-10. A signal of the form

\[ s(t) = A \cos \omega t, \qquad 0 \le t \le T, \quad \omega = \frac{2\pi}{T}, \]

is to be detected in Gaussian noise having the autocovariance function

\[ \phi(t, u) = \sum_k \mu_k \cos k\omega t \cos k\omega u, \qquad \omega = \frac{2\pi}{T}, \]

where all $\mu_k$ are positive. The input $v(t)$ to the receiver is observed during the interval
$(0, T)$. Determine the optimum detector of the signal $s(t)$ under the Neyman-Pearson
criterion, and calculate the false-alarm and detection probabilities $Q_0$ and $Q_d$. Express
$Q_d$ in terms of the energy of the signal $s(t)$ and other parameters in the formulation of
the problem.

2-11. The signal $s(t) = At \exp(-bt) U(t)$ is received in the presence of noise of
autocovariance function $\phi(\tau) = \phi_0 \exp(-\mu|\tau|)$ as in (2-102). Show how the input $v(t)$ to the
receiver should be processed by a matched filter and a delay line to decide whether
the signal is present. Calculate the effective signal-to-noise ratio $d^2$, and state how the
probability of detection depends on it. Hint: Use the results of Sec. 2.3.

2-12. A signal $s(t)$ is to be detected in a mixture of white noise of unilateral spectral density
$N$ and correlated noise whose autocovariance function is that given in (2-102). The
spectral density of the noise is thus the same as that in Example 2-2. Show how to
calculate the impulse response of the detection filter by the technique described in
Sec. 2.3. As an example take the signal as a constant, $s(t) = A$, and work out the
impulse response of the filter and the effective signal-to-noise ratio $d^2$.

Show how the solution $q(t)$ of (2-91) passes in the limit of vanishing white noise
($N \to 0$) to a solution involving delta functions as in (2-104). In Fig. 2-11 we have
plotted $q(t)$ for $\mu T = 2$ and for various values of the parameter $4\phi_0/\mu N$. The peaks
at $t = 0$ and $t = T$ become sharper and sharper as the spectral density of the white
noise decreases.

2-13. The noise at the input to a receiver is stationary and Gaussian and consists of the
sum of white noise with unilateral spectral density $N$ and noise whose autocovariance
function is that given in (2-102). The receiver is to detect a signal of the form

\[ s(t) = A e^{-a|t|}, \qquad -\infty < t < \infty. \]

An infinitely long observation interval is allowed. Find the optimum detection statistic
under the Neyman-Pearson criterion and calculate its probability of detection for fixed
false-alarm probability.
2-14. (a) A rectangular signal of duration $T_1$,

\[ s(t) = A, \quad 0 < t < T_1; \qquad s(t) = 0, \quad t < 0, \ t > T_1, \]

is to be detected in the presence of Gaussian noise that is a combination of white
noise and stationary noise having a Lorentz spectral density as in Example 2-2.
The input to the receiver is observed during an interval $(-T, T)$ that is much
longer than both the signal and the width $\mu^{-1}$ of the autocovariance function of
the nonwhite component of the noise.

Determine the impulse response of the optimum linear filter for detecting
this signal with maximum probability $Q_d$ of detection for preassigned false-alarm
probability $Q_0$. Assume that $T \gg T_1$, whereupon (2-77) is appropriate. Sketch
the impulse response and describe how the receiver utilizes this filter. Sketch the
signal component of the output of this filter under hypothesis $H_1$.
(b) Show how the optimum receiver of part (a) can be decomposed as in Fig. 2-4 into
a whitening filter followed by a second linear filter. Give the transfer function and
the impulse response of the whitening filter for the signal and noise specified in
part (a), and calculate the signal that appears at the output of the whitening
filter when the rectangular signal $s(t)$ is present at the input. Sketch this output
signal and give the impulse response of the second member of the cascade.
(c) Calculate the maximum probability $Q_d$ of detection attained by the optimum
receiver of part (a) in terms of the preassigned false-alarm probability $Q_0$ and an
appropriate signal-to-noise ratio $d$, and calculate that signal-to-noise ratio.

Hint: This problem is most simply solved in the time domain by making use
of the convolution theorem. Compare with Example 2-2.

Figure 2-11. Solution $q(t)$ of the detection integral equation (2-91) with the spectral
density of Problem 2-12 and a constant signal $s(t) \equiv 1$; $\mu T = 2$. Curves are
indexed with values of the parameter $4\phi_0/\mu N$. (Horizontal axis: $t/T$, from 0.0 to 1.0.)
2-15. A signal whose spectrum $S(\omega)$ is bandlimited,

\[ S(\omega) = A, \quad -\pi W < \omega < \pi W; \qquad S(\omega) = 0, \quad |\omega| > \pi W, \]

is to be detected in the presence of stationary Gaussian noise of autocovariance function 

a 1 + t 2 

Observation during an infinite interval is permitted. Calculate in terms of the
error-function integral the maximum possible probability $Q_d$ of detection for a fixed
false-alarm probability $Q_0$.

2-16. A binary communication system is to transmit messages coded into 0's and 1's, which
occur independently every $T$ seconds and with equal relative frequencies. Two systems
are contemplated. In system (a) the 1's are transmitted by sending a signal that is
received as $af(t)$, which falls entirely within the interval $(0, T)$; for 0's nothing is sent.
This is called an on-off system. In system (b) the 1's are transmitted by sending a signal
received as $bf(t)$, the 0's by sending one received as $-bf(t)$; these are called antipodal
signals. The signals are received in white Gaussian noise of unilateral spectral density
$N$. For each system determine the receiver that minimizes the average probability $P_e$
of error. Find the ratio of the average power transmitted by system (a) to that required
by system (b) in order that the error probabilities $P_e$ be equal.
2-17. A signal $Af(t)$ of known form and amplitude is to be detected in white Gaussian noise
of unilateral spectral density $N$. Fix the cost matrix $\mathbf{C}$ and the prior probabilities $\zeta_0$
and $\zeta_1$ of hypotheses $H_0$ and $H_1$. Show that the false-alarm and detection probabilities
are given as in (2-72) with

\[ x = \frac{1}{d} \ln \Lambda_0 + \frac{d}{2}, \qquad d = \sqrt{2E/N}, \]

and with $\Lambda_0$ as given in (1-17). Find the minimum Bayes cost $C_{\min}(d)$ as a function
of the signal-to-noise ratio $d$, and show that as $d$ goes to zero, $C_{\min}(0) - C_{\min}(d)$ is
proportional, for $d \ll 1$, to

\[ d^3 \exp\left(-\frac{\ln^2 \Lambda_0}{2d^2}\right). \]

This difference $C_{\min}(0) - C_{\min}(d)$ of Bayes costs therefore approaches zero faster than
any power of $d$ as $d$ goes to zero. Hint: You will need the asymptotic form of the
error-function integral in (1-11):

\[ \operatorname{erfc} x \approx \frac{1}{x\sqrt{2\pi}}\, e^{-x^2/2} \left(1 - \frac{1}{x^2} + \frac{3}{x^4}\right), \qquad x \gg 1, \tag{2-135} \]

[Abr70, eq. (26.2.12), p. 932]. This can be derived by successively integrating (1-11) by
parts.

2-18. For $a > 2T$, solve the integral equation

\[ \cos \omega t = \phi_0 \int_{-T}^{T} \left(1 - \frac{|t - s|}{a}\right) q(s)\,ds, \qquad -T < t < T. \]

Hint: By differentiating the triangular kernel $\phi(t - s)$ twice, show that it satisfies the
differential equation

\[ \frac{d^2 \phi(t)}{dt^2} = F(t), \]

where $F(t)$ involves delta functions. Determine $F(t)$ and then use the technique of
Sec. 2.3.1.

2-19. Consider a pulse-amplitude-modulated (PAM) system in which messages are coded into
an odd number $M = 2n + 1$ of symbols corresponding to signals received as

\[ s_j(t) = A_j f(t), \qquad \int_0^T [f(t)]^2\,dt = 1, \qquad 1 \le j \le M = 2n + 1, \]

during an observation interval $(0, T)$. The amplitudes $A_1, A_2, \ldots, A_M$ are uniformly
spaced about zero: $A_{n+1} = 0$. The noise is white and Gaussian with unilateral spectral
density $N$. The signals occur with equal relative frequencies $M^{-1}$. Determine the
optimum receiver for deciding among these PAM signals and calculate the probability
of error as a function of the average received energy $E$ and the noise spectral density
$N$. Observe the relevance of Example 1-1.
2-20. A ternary communication system transmits one of three signals $f(t)$, 0, or $-f(t)$ every
$T$ seconds. They are received with energy $E$ or energy 0 in white Gaussian noise of
unilateral spectral density $N$. At the end of each interval $(0, T)$ the output $x$ of a
filter matched to $f(t)$ is measured and compared with decision levels $+a$ and $-a$. If
$x > a$, the decision is made that $+f(t)$ was sent; if $x < -a$, that $-f(t)$ was sent; and if
$-a < x < a$, that 0 was sent. What is the probability $Q_c = 1 - P_e$ of a correct decision
as a function of $a$, $E$, and $N$ when all three signals are sent equally often? What is the
maximum possible value of $Q_c$ and for what value of $a$ is it attained?
2-21. A quaternary communication system transmits every $T$ seconds one of four equally
likely signals as in (2-124). The functions $f_1(t)$ and $f_2(t)$ are orthogonal, and the signals
are received with equal energies $E$ in white Gaussian noise. The receiver has filters
matched to $f_1(t)$ and $f_2(t)$ and observes their outputs $y_1$ and $y_2$ at the end of each
interval $(0, T)$. In terms of these, specify the strategy that minimizes the probability of
error in deciding which signal has been received. Calculate that minimum probability
of error as a function of the signal-to-noise ratio. Hint: A judiciously selected rotation
of axes in the signal space much simplifies this calculation.

2-22. Every $T$ seconds a quaternary PAM communication system transmits one of four
signals,

\[ A = a_1 f(t), \quad B = a_2 f(t), \quad C = -a_2 f(t), \quad D = -a_1 f(t), \qquad 0 < a_2 < a_1. \]

The signals are sent with equal relative frequencies. They are received with a common
attenuation $\mu$ in the presence of white Gaussian noise of spectral density $N$; that is, if
$A$ is sent, $\mu a_1 f(t)$ is received, and so on. At the end of each interval $(0, T)$ the output
$y$ of a filter matched to $f(t)$ is compared with three decision levels $b$, 0, and $-b$, and
the decisions about the transmitted signals are made on the basis of the scheme

\[ y > b \to A; \qquad 0 < y < b \to B; \qquad -b < y < 0 \to C; \qquad y < -b \to D. \]

Calculate the average probability of error in deciding among these four signals, and
choose the value of $b$ that minimizes it. Show how you would determine the values
of the amplitudes $a_1$ and $a_2$ to make this minimum probability of error as small as
possible under the constraint of fixed average transmitted power.
2-23. In a communication system sending messages expressed in an alphabet of four symbols
A, B, C, and D, the transmitter sends nothing for each A. For each B it sends a
signal received as $s_1(t)$. For each C it sends a signal that is received as $s_2(t)$; $s_2(t)$ is
orthogonal to $s_1(t)$, but does not necessarily have the same energy. For each D the
transmitter sends a signal received as $s_1(t) + s_2(t)$. The received signals are confined
to an observation interval $(0, T)$. Successive message symbols appear every $T$ seconds,
are statistically independent, and occur with equal relative frequencies $\tfrac{1}{4}$. The signals
are received in white Gaussian noise with unilateral spectral density $N$. Describe the
receiver that decides among the four possible signals with minimum probability $P_e$ of
error, and calculate that probability in terms of $N$ and of the energies $E_1$ and $E_2$ of
signals $s_1(t)$ and $s_2(t)$.

2-24. A nonary communication system sends messages coded into an alphabet of nine
symbols, which occur with equal relative frequencies 1/9. In terms of two functions $f_1(t)$
and $f_2(t)$ that are orthonormal over the observation interval $(0, T)$, the nine signals, as
received, are as follows:

\[ s_1(t) \equiv 0; \quad s_2(t) = A f_1(t); \quad s_3(t) = A[f_1(t) + f_2(t)]; \]
\[ s_4(t) = A f_2(t); \quad s_5(t) = A[-f_1(t) + f_2(t)]; \quad s_6(t) = -A f_1(t); \]
\[ s_7(t) = -A[f_1(t) + f_2(t)]; \quad s_8(t) = -A f_2(t); \quad s_9(t) = A[f_1(t) - f_2(t)]. \]

The signals are received in white Gaussian noise $n(t)$ during observation intervals that
we denote as usual by $(0, T)$.

(a) Draw a diagram representing these nine signals in an appropriately defined signal
space based on the functions $f_1(t)$ and $f_2(t)$.

(b) Determine the optimum receiver for deciding with minimum probability $P_e$ of error
which of these nine signals is present in the input

\[ v(t) = n(t) + s_k(t), \qquad 1 \le k \le 9. \]

Use the signal-space diagram to indicate the decision regions.

(c) Calculate the probability $P_e$ of error for the optimum system. Express it in terms
of the average signal power received and the noise spectral density $N$.

(d) Taking

\[ f_1(t) = \left(\frac{2}{T}\right)^{1/2} \cos \omega t, \qquad f_2(t) = \left(\frac{2}{T}\right)^{1/2} \sin \omega t, \qquad \omega = \frac{2\pi k}{T}, \]

for some integer $k$, determine the nine signals in the simplest possible form.
2-25. In the quaternary communication system characterized by (2-124), the functions $f_k(t)$
are normalized as before, but they are not orthogonal:

\[ \int_0^T [f_k(t)]^2\,dt = 1, \quad k = 1, 2; \qquad \int_0^T f_1(t) f_2(t)\,dt = r, \quad 0 < |r| < 1. \]

Thus the signals have equal energies, but the outputs at $t = T$ of filters matched to
them are correlated. Their relative frequencies are still equal to $\tfrac{1}{4}$, and they are
received in white Gaussian noise of unilateral spectral density $N$. Calculate the minimum
probability $P_e$ of error in deciding among these signals, and determine the receiver that
attains it.

3

Narrowband Signals and Their Detection


3.1 NARROWBAND SIGNALS AND FILTERS 

Because an antenna of convenient size radiates most efficiently at high frequen- 
cies, the signals transmitted in radio communications and radar consist of a high- 
frequency carrier modulated in amplitude or frequency or both. The strategy de- 
veloped in Chapter 2 for detecting a signal in Gaussian noise required that its form 
s(t) be completely known to the receiver. More often than not, however, the phase 
of the carrier of a received radio or radar signal is unknown, and other parameters 
such as its amplitude, its arrival time, and even the frequency of its carrier may also 
be unknown or uncertain within a wide range of possible values. We must broaden 
our theory to accommodate such uncertainties in the signals to be detected. Before 
entering on the necessary modifications to the general theory, we shall discuss its 
simplest extension, which treats the detection of a high-frequency signal of unknown 
carrier phase. First we must review the concept of modulation and introduce a con- 
cise notation for expressing modulated signals and the properties of the noise that 
accompanies them. 

A simple radar pulse may look somewhat like the signal depicted in Fig. 3-1,
for which $s(t)$ can be written as

\[ s(t) = A \cos \Omega t, \quad 0 < t < T; \qquad s(t) = 0, \quad t < 0, \ t > T. \]

A pulse of this kind usually contains a large number of cycles of the radio-frequency
carrier; in a typical such signal the carrier frequency $\Omega$ may be $2\pi \cdot 10^9$ rad/sec (a
1000-MHz carrier), and the duration $T$ may be 1 $\mu$sec $= 10^{-6}$ sec.

Figure 3-1. Narrowband pulse.



This kind of signal is said to be amplitude modulated, for only the extent of the
oscillations of the carrier is affected. The outputs of certain broadcasting transmitters
are also amplitude modulated and can be described by the expression

\[ s(t) = A[1 + m(t)] \cos \Omega t. \tag{3-1} \]

The function $m(t)$, which changes only slightly over a period $2\pi/\Omega$ of the carrier
$\cos \Omega t$, represents the voices and music being transmitted. Care is taken to prevent
the factor $[1 + m(t)]$ from becoming negative. The signal in (3-1) looks like a sinusoid
with an irregularly fluctuating amplitude.

The output of a transmitter modulated in frequency can be written in the form

\[ s(t) = A \cos\left[\Omega t + \int_0^t w(t')\,dt'\right]. \]

The instantaneous frequency of this signal is $\Omega + w(t)$, and the slowly varying
function $w(t)$ carries the information being broadcast. Such a wave has a constant
amplitude, but the times at which it crosses the zero level shift about with the
modulation.

A high-frequency carrier with the most general kind of modulation can be
represented by the equation

\[ s(t) = M(t) \cos[\Omega t + \phi(t)]. \tag{3-2} \]

We shall call the function $M(t)$ the amplitude modulation and the function $\phi(t)$ the
phase modulation; the derivative $d\phi/dt$ is the frequency modulation of the signal. All
these modulations vary much more slowly than the radio-frequency carrier. Some
radar signals are modulated in both amplitude and frequency (or phase) in this
way.

Figure 3-2. Demodulating receiver.



By expanding the cosine in (3-2) we obtain 

\[
\begin{aligned}
s(t) &= M(t)[\cos\phi(t)\cos\Omega t - \sin\phi(t)\sin\Omega t] \\
     &= X(t)\cos\Omega t - Y(t)\sin\Omega t
\end{aligned}
\tag{3-3}
\]

with 

\[ X(t) = M(t)\cos\phi(t), \qquad Y(t) = M(t)\sin\phi(t); \tag{3-4} \]

$X(t)$ is often called the in-phase component and $Y(t)$ the quadrature component, 
but we shall designate the pair of them as the quadrature components of the signal. 
They too change only slightly during one cycle of the carrier $\cos\Omega t$. The function 
$X(t)$ can be extracted by multiplying the signal $s(t)$ with the output $2\cos\Omega t$ of a 
local oscillator (LO) and filtering off the components of the product with carrier 
frequency $2\Omega$ by means of a low-pass filter (LPF): 

\[ 2s(t)\cos\Omega t = X(t)(1 + \cos 2\Omega t) - Y(t)\sin 2\Omega t \;\to\; X(t). \]

A device accomplishing this is called a mixer or a homodyne detector. It is said to 
mix or beat together the signals $s(t)$ and $2\cos\Omega t$. The other quadrature component 
$Y(t)$ can be extracted by mixing $s(t)$ with the signal $2\sin\Omega t$. Figure 3-2 shows 
the block diagram of a receiver whose input is the modulated signal $s(t)$ and whose 
outputs are the quadrature components $X(t)$ and $Y(t)$. 
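
The mixing and low-pass filtering just described can be tried numerically. The following is only a sketch under stated assumptions: the function name `demodulate`, the use of a plain average over a whole number of carrier cycles as a stand-in for the low-pass filter, and the sample envelopes are all illustrative choices, not part of the text. Note that multiplying by $2\sin\Omega t$ and low-pass filtering yields $-Y(t)$, so the sketch mixes with $-2\sin\Omega t$ to obtain $+Y(t)$.

```python
import math

def demodulate(x_env, y_env, omega, t0, cycles=4, steps_per_cycle=200):
    """Estimate X(t0) and Y(t0) from s(t) = X cos(omega t) - Y sin(omega t).

    x_env, y_env: callables returning the slowly varying quadrature components.
    The average over an integer number of carrier periods centred on t0 is an
    assumed, idealized substitute for the low-pass filter (LPF).
    """
    period = 2.0 * math.pi / omega
    n = cycles * steps_per_cycle
    dt = cycles * period / n
    acc_x = 0.0
    acc_y = 0.0
    for j in range(n):
        t = t0 - 0.5 * cycles * period + (j + 0.5) * dt
        s = x_env(t) * math.cos(omega * t) - y_env(t) * math.sin(omega * t)
        acc_x += 2.0 * s * math.cos(omega * t)    # in-phase mixer
        acc_y += -2.0 * s * math.sin(omega * t)   # sign chosen so the output is +Y
    return acc_x / n, acc_y / n
```

With envelopes that change little over a few carrier periods, the two averages reproduce $X(t_0)$ and $Y(t_0)$ up to an error of the order of the envelope's fractional change per carrier cycle.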

By combining these quadrature components into a complex function of the 
time, we obtain a convenient representation of a modulated signal in terms of its 
complex envelope 

\[ F(t) = X(t) + iY(t) = M(t)\,e^{i\phi(t)}, \tag{3-5} \]

and we write the signal compactly as 

\[ s(t) = \operatorname{Re} F(t)\,e^{i\Omega t}. \tag{3-6} \]

Re stands for the real part of the complex number following it. The complex function 
$F(t)$ can be pictured as a vector at the origin of the $XY$-plane. The end of the vector 
moves about in the plane, and all the while the plane itself rotates with an angular 
velocity $\Omega$. The signal $s(t)$ is the projection of this rotating vector on a fixed line. 
The complex envelope of a modulated signal is a natural generalization of the phasor 
representation of alternating currents and voltages. When the motion of the vector 
$F(t)$ within the rotating plane is much slower than the rate of rotation, the signal 
$s(t)$ is said to be quasiharmonic. 

If the signal is a pulse of finite energy, its complex envelope possesses a Fourier 
transform 

\[ f(\omega) = \int_{-\infty}^{\infty} F(t)\,e^{-i\omega t}\,dt, \tag{3-7} \]

in terms of which the spectrum $S(\omega)$ of the signal is 

\[
\begin{aligned}
S(\omega) &= \int_{-\infty}^{\infty} s(t)\,e^{-i\omega t}\,dt
 = \tfrac{1}{2}\int_{-\infty}^{\infty} \left[F(t)\,e^{i\Omega t} + F^*(t)\,e^{-i\Omega t}\right] e^{-i\omega t}\,dt \\
 &= \tfrac{1}{2}\left[f(\omega - \Omega) + f^*(-\omega - \Omega)\right].
\end{aligned}
\tag{3-8}
\]

Because the quadrature components of $F(t)$ vary much more slowly than the carrier 
$\cos\Omega t$, the width in frequency of the modulus $|f(\omega)|$ of its Fourier transform is much 
smaller than $\Omega$. The modulus $|S(\omega)|$ of the spectrum of the signal then exhibits two 
narrow peaks, one near the frequency $\Omega$ and the other near $-\Omega$. Because of this 
structure, $s(t)$ is called a narrowband signal. 

The spectrum in (3-8) satisfies the condition $S(-\omega) = S^*(\omega)$ imposed by the 
reality of the signal $s(t)$. The Fourier transform $f(\omega)$ of the complex envelope satisfies 
a similar condition if $F(t)$ is real and the signal is purely amplitude modulated. Only 
then will the modulus $|f(\omega)|$ be an even function and will the peaks of $|S(\omega)|$ be 
symmetrical about the carrier frequency $\Omega$. Indeed, the carrier frequency is quite 
arbitrary. Shifting it by an amount $k$ simply introduces a factor $\exp(-ikt)$ into the 
complex envelope, 

\[ F(t)\,e^{i\Omega t} = \left[F(t)\,e^{-ikt}\right] e^{i(\Omega + k)t}, \]

without changing the signal $s(t)$. An appropriate choice of the carrier frequency $\Omega$ 
sometimes simplifies a calculation. 
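
The arbitrariness of the carrier frequency is easy to confirm numerically. The sketch below builds $s(t) = \operatorname{Re} F(t)e^{i\Omega t}$ from a complex envelope and checks that the envelope $F(t)e^{-ikt}$ referred to the carrier $\Omega + k$ gives the same signal. The helper names and the Gaussian-shaped example envelope `F_example` are assumptions made purely for illustration.

```python
import cmath
import math

def signal_from_envelope(F, omega, t):
    """s(t) = Re[F(t) exp(i omega t)], the compact form (3-6)."""
    return (F(t) * cmath.exp(1j * omega * t)).real

def shifted_envelope(F, k):
    """Complex envelope referred to the carrier omega + k: F(t) exp(-i k t)."""
    return lambda t: F(t) * cmath.exp(-1j * k * t)

def F_example(t):
    """Hypothetical envelope: M(t) = exp(-t^2), phi(t) = 0.5 t."""
    amplitude = math.exp(-t * t)      # amplitude modulation M(t)
    phase = 0.5 * t                   # phase modulation phi(t)
    return amplitude * cmath.exp(1j * phase)
```

Evaluating `signal_from_envelope(F_example, 100.0, t)` and `signal_from_envelope(shifted_envelope(F_example, 3.0), 103.0, t)` at the same instant gives the same value to rounding error, and both agree with $M(t)\cos[\Omega t + \phi(t)]$ from (3-2).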

If a narrowband signal is not presented in the explicitly modulated form of 
(3-2), its complex envelope can be derived from what is called the analytic signal. 
We write $s(t)$ as the sum of a positive-frequency part $s_+(t)$ and a negative-frequency 
part $s_-(t)$, which are defined in terms of its spectrum $S(\omega)$ by 

\[
s_+(t) = \int_0^{\infty} S(\omega)\,e^{i\omega t}\,\frac{d\omega}{2\pi}, \qquad
s_-(t) = s_+^*(t) = \int_{-\infty}^{0} S(\omega)\,e^{i\omega t}\,\frac{d\omega}{2\pi}.
\tag{3-9}
\]

For a narrowband signal the positive-frequency part of the spectrum is concentrated 
in the neighborhood of an angular frequency $\Omega$, and the complex envelope can be 
defined as 

\[ F(t) = 2s_+(t)\,e^{-i\Omega t}. \]

The analytic signal $s_+(t)$ was introduced by Gabor [Gab46]. It is more general than 
the complex envelope, for it can be defined for any signal that possesses a spectrum. 
In terms of the signal $s(t)$ itself, it is given by the integral 

\[ s_+(t) = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{s(z)\,dz}{z - t}, \tag{3-10} \]

in which $z$ has an infinitesimal negative imaginary part. The real and imaginary 
parts of the analytic signal are related by the Hilbert transform, for which we refer 
the reader to [McD56], [Gui63, Ch. 18], and [Sch66, Sec. 1.6]. The definitions in 
(3-9) and (3-10) are of little practical value, and the analytic signal is useful mainly 
when the signal $s(t)$ is quasiharmonic, whereupon the complex envelope serves as 
well. This and other definitions of complex envelopes have been reviewed by Rice 
[Ric82]. 

A filter whose output is a function of the amplitude modulation of a narrowband 
signal applied to its input is known as a rectifier. It must be nonlinear, for a 
linear filter could not remove the oscillations of the carrier without destroying the 
envelope as well. A typical rectifier whose output is related directly to the amplitude 
modulation of its input is the quadratic rectifier. It first squares its input, yielding 
by (3-3) 

\[
[s(t)]^2 = \tfrac{1}{2}\{[X(t)]^2 + [Y(t)]^2\}
 + \tfrac{1}{2}\{[X(t)]^2 - [Y(t)]^2\}\cos 2\Omega t - X(t)Y(t)\sin 2\Omega t,
\tag{3-11}
\]

after which a low-pass filter removes the terms with frequencies in the vicinity of $2\Omega$, 
so that the final output is proportional to $|F(t)|^2 = [M(t)]^2 = [X(t)]^2 + [Y(t)]^2$. 
Rectifiers having other than a quadratic characteristic 
produce some other monotone function of the absolute value $|F(t)| = M(t)$; a linear 
rectifier yields $M(t)$ itself. 

A device whose output is proportional to the instantaneous frequency deviation 
$\phi'(t)$ is known as a discriminator; its output can be taken as proportional to 

\[ \phi'(t) = \frac{d}{dt}\operatorname{Im}[\ln F(t)] \]

in terms of the complex envelope $F(t)$ of the input. Any given discriminator or 
rectifier circuit must of course be analyzed to determine the accuracy of these descriptions 
of its action on the input signal. 

If we integrate (3-11) over $-\infty < t < \infty$ and recognize that for quasiharmonic 
signals the integrals of the terms proportional to $\cos 2\Omega t$ and $\sin 2\Omega t$ will be much 
smaller than the others, we obtain 

\[ E = \int_{-\infty}^{\infty} [s(t)]^2\,dt = \tfrac{1}{2}\int_{-\infty}^{\infty} |F(t)|^2\,dt \]

for the energy of the signal. 

Quasiharmonic signals are often transformed by means of linear filters that 
attenuate components of all frequencies except those in the neighborhood of the 
input carrier frequency. One purpose of such filtering is to eliminate noise lying 
outside the frequency band of the signals. The analysis of these pass filters illustrates 
the simplification brought about by the complex notation introduced here. 
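
The action of the quadratic rectifier described above can be illustrated with a short numerical sketch. It squares $s(t) = X(t)\cos\Omega t - Y(t)\sin\Omega t$ and averages over a whole number of carrier cycles, which removes the terms at $2\Omega$ in (3-11) and leaves $\tfrac{1}{2}|F(t)|^2$. The function name and the use of a simple cycle average in place of a physical low-pass filter are assumptions of this illustration.

```python
import math

def quadratic_rectifier(x_env, y_env, omega, t0, cycles=4, steps_per_cycle=200):
    """Average of [s(t)]^2 over `cycles` carrier periods centred on t0.

    For slowly varying x_env, y_env the result approximates
    0.5 * (X(t0)^2 + Y(t0)^2) = 0.5 * |F(t0)|^2, the low-frequency term of (3-11).
    """
    period = 2.0 * math.pi / omega
    n = cycles * steps_per_cycle
    dt = cycles * period / n
    total = 0.0
    for j in range(n):
        t = t0 - 0.5 * cycles * period + (j + 0.5) * dt
        s = x_env(t) * math.cos(omega * t) - y_env(t) * math.sin(omega * t)
        total += s * s
    return total / n
```

For a constant envelope with $X = 3$ and $Y = 4$ the output is $\tfrac{1}{2}(9 + 16) = 12.5$, whatever the carrier frequency.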



We shall deal with linear narrowband pass filters that least attenuate those 
components of their inputs whose frequencies lie in a range of width $W$ about some 
high frequency $\Omega$, with $W$ much smaller than $\Omega$, $W \ll \Omega$. It is convenient to write 
the transfer function of such a filter in the form 

\[ y(\omega) = Y(\omega - \Omega) + Y^*(-\omega - \Omega), \tag{3-12} \]

in which the complex function $Y(\omega)$ differs significantly from zero only over a 
narrow range of frequencies about $\omega = 0$. Equation (3-12) satisfies the condition of 
symmetry $y(-\omega) = y^*(\omega)$, which is a consequence of the reality of the impulse 
response 

\[ k(\tau) = \int_{-\infty}^{\infty} y(\omega)\,e^{i\omega\tau}\,\frac{d\omega}{2\pi} \tag{3-13} \]

of the filter. If a linear narrowband pass filter consists of lumped circuit elements, the 
poles of its transfer function $y(\omega)$ lie in the neighborhood of $\omega = +\Omega$ and $\omega = -\Omega$. 
We can then decompose $y(\omega)$ as in (3-12) from its expansion in partial fractions, 
taking the terms with poles near $\omega = +\Omega$ into $Y(\omega - \Omega)$ and leaving the rest for the 
term $Y^*(-\omega - \Omega)$. 

By means of (3-12) and (3-13) we can write the impulse response $k(\tau)$ of the 
narrowband filter as 

\[
\begin{aligned}
k(\tau) &= \int_{-\infty}^{\infty} \left[Y(\omega - \Omega) + Y^*(-\omega - \Omega)\right] e^{i\omega\tau}\,\frac{d\omega}{2\pi} \\
 &= \int_{-\infty}^{\infty} Y(\omega)\,e^{i(\omega + \Omega)\tau}\,\frac{d\omega}{2\pi}
  + \int_{-\infty}^{\infty} Y^*(\omega)\,e^{-i(\omega + \Omega)\tau}\,\frac{d\omega}{2\pi} \\
 &= 2\operatorname{Re} K(\tau)\,e^{i\Omega\tau},
\end{aligned}
\]

where 

\[ K(\tau) = \int_{-\infty}^{\infty} Y(\omega)\,e^{i\omega\tau}\,\frac{d\omega}{2\pi}. \tag{3-14} \]

If $Y(\omega)$ is significant only over a range of frequencies of width $W$, the function 
$K(\tau)$ changes appreciably in a range of values of $\tau$ whose width is of the order of 
$1/W$. By analogy with the concept of the complex envelope of a narrowband signal, 
we can consider $K(\tau)$ as one-half the complex envelope of the impulse response of 
the narrowband filter. When such a filter is excited by a sharp impulse, it "rings," 
and its output oscillates with frequency $\Omega$; this output decays with time in a manner 
described by the envelope $2K(\tau)$. 

We shall now show that the envelope function $K(\tau)$ transforms the complex 
envelope of the input signal to produce that of the output signal in much the same 
way as the impulse response $k(\tau)$ acts on the signal itself, 

\[ s_o(t) = \int_0^{\infty} k(\tau)\,s_i(t - \tau)\,d\tau, \tag{3-15} \]

the subscripts $i$ and $o$ denoting input and output, respectively. The spectra of the 
input and output signals are related by 

\[ S_o(\omega) = y(\omega)\,S_i(\omega), \]

and if we write them in the form (3-8) and use (3-12) for $y(\omega)$, we find 

\[
\begin{aligned}
\tfrac{1}{2}\left[f_o(\omega - \Omega) + f_o^*(-\omega - \Omega)\right]
 &= \left[Y(\omega - \Omega) + Y^*(-\omega - \Omega)\right]\tfrac{1}{2}\left[f_i(\omega - \Omega) + f_i^*(-\omega - \Omega)\right] \\
 &= \tfrac{1}{2}Y(\omega - \Omega)f_i(\omega - \Omega) + \tfrac{1}{2}Y^*(-\omega - \Omega)f_i^*(-\omega - \Omega) \\
 &\quad + \tfrac{1}{2}Y^*(-\omega - \Omega)f_i(\omega - \Omega) + \tfrac{1}{2}Y(\omega - \Omega)f_i^*(-\omega - \Omega).
\end{aligned}
\]

For quasiharmonic signals and narrowband filters the last two terms are much 
smaller than the first two, and we can write approximately 

\[ f_o(\omega - \Omega) \approx Y(\omega - \Omega)\,f_i(\omega - \Omega) \quad\text{or}\quad f_o(\omega) \approx Y(\omega)\,f_i(\omega), \]

from which, by the convolution theorem for Fourier transforms, we find for the 
complex envelope of the output signal 

\[ F_o(t) = \int_0^{\infty} K(\tau)\,F_i(t - \tau)\,d\tau. \tag{3-16} \]

By comparing the terms we dropped with those we retained, we see that the 
relative error we committed is on the order of the magnitude of $|Y(2\Omega)/Y(0)|$ or 
$|f_i(2\Omega)/f_i(0)|$, whichever is the greater. Within this approximation, we can use $K(\tau)$, 
which we call the complex impulse response of the narrowband filter, in the same 
way when dealing with complex envelopes as we ordinarily use the original impulse 
response $k(\tau)$ when dealing with the signals themselves. 

Let us illustrate these results for a filter consisting of a simply resonant circuit, 
as shown in Fig. 3-3. When the input and output are measured across the terminals 
shown, the transfer function is 

\[
y(\omega) = \frac{R}{R + i\left(\omega L - \dfrac{1}{\omega C}\right)}
 = \frac{2i\mu\omega}{\omega_0^2 + 2i\mu\omega - \omega^2}, \qquad
\mu = \frac{R}{2L}, \quad \omega_0^2 = \frac{1}{LC}.
\]

The poles of $y(\omega)$ lie at $\omega = i\mu + \nu$ and $\omega = i\mu - \nu$, $\nu^2 = \omega_0^2 - \mu^2$. For a 
narrowband or "high-Q" filter, $\mu \ll \omega_0$; and we can take the pass frequency as $\Omega = \nu$, 
which is close to the resonant frequency $\omega_0$. Then it is simple to show that $y(\omega)$ 
satisfies (3-12) when $Y(\omega)$ is defined by 

\[ Y(\omega) = \frac{-i\mu(\nu + i\mu)}{\nu(\omega - i\mu)} \approx \frac{1}{1 + i\omega/\mu}. \]

By (3-14) the complex impulse response has the simple form 

\[ K(\tau) = \frac{\mu}{\nu}(\nu + i\mu)\,e^{-\mu\tau}\,U(\tau) \approx \mu\,e^{-\mu\tau}\,U(\tau), \tag{3-17} \]

where $U(\cdot)$ is the unit step function. For an input consisting of a carrier of frequency 
near $\Omega$ whose modulation varies slowly, it is simpler to calculate the modulation of 
the output by means of (3-17) and (3-16) than to use the usual methods embodied 
in formulas like (3-15), in which $k(\tau)$ is a somewhat more complicated function. 
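
As a check on this claim, the sketch below compares the two routes for a carrier $\cos\Omega t$ switched on at $t = 0$, so that $F_i(t) = U(t)$. The direct route integrates (3-15) numerically with $k(\tau) = 2\operatorname{Re} K(\tau)e^{i\Omega\tau}$; the envelope route uses (3-16), which here integrates in closed form to $F_o(t) = K_0(1 - e^{-\mu t})/\mu$ with $K_0 = \mu(\nu + i\mu)/\nu$. The numerical values of $\mu$ and $\nu$ and all function names are assumptions chosen only for illustration.

```python
import cmath
import math

MU = 1.0      # mu = R/2L, hypothetical value
NU = 50.0     # pass frequency Omega = nu, hypothetical value with mu << nu
K0 = MU * (NU + 1j * MU) / NU   # coefficient of exp(-mu tau) in (3-17)

def K(tau):
    """Complex impulse response (3-17)."""
    return K0 * math.exp(-MU * tau) if tau >= 0.0 else 0.0

def k(tau):
    """Real impulse response k(tau) = 2 Re[K(tau) exp(i Omega tau)]."""
    return 2.0 * (K(tau) * cmath.exp(1j * NU * tau)).real

def output_direct(t, steps=6000):
    """(3-15) for s_i(t) = cos(Omega t) U(t), by the midpoint rule."""
    dt = t / steps
    total = 0.0
    for j in range(steps):
        tau = (j + 0.5) * dt
        total += k(tau) * math.cos(NU * (t - tau))
    return total * dt

def output_envelope(t):
    """(3-16) for F_i(t) = U(t): integral of K(tau) from 0 to t in closed form."""
    return K0 * (1.0 - math.exp(-MU * t)) / MU
```

The signal rebuilt from the envelope, `(output_envelope(t) * cmath.exp(1j * NU * t)).real`, differs from `output_direct(t)` by about $\mu/\nu$ at most, which is the size of the neglected terms estimated above.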
Figure 3-3. Simply resonant circuit. 



3.2 NARROWBAND NOISE 



3.2.1 The Complex Representation 



The antenna that picks up a high-frequency radio or radar signal and the waveguide 
or coaxial cable that conveys it to the receiver respond significantly only over a 
relatively narrow range of frequencies about the carrier frequency of the signal. The 
input $v(t)$ to the receiver has in effect passed through a narrowband filter of the 
type specified by (3-12). Its passband is usually broad enough so that the filter does 
not appreciably distort the signal, but it converts the thermal noise into noise with 
a spectral density concentrated in the neighborhood of the pass frequency $\Omega$ of the 
input. Even if the input circuitry did not have such an effect on the noise, it would 
generally be convenient, before processing the input further, to remove from it the 
noise in frequency ranges far from that of the signal by passing the input through 
just such a narrowband filter as that in (3-12). The output noise $n(t)$ will then have 
a spectral density of the form 

\[ \Phi(\omega) = \tilde{\Phi}(\omega - \Omega) + \tilde{\Phi}(-\omega - \Omega), \tag{3-18} \]

in which the function $\tilde{\Phi}(\omega)$ is real and nonnegative and differs significantly from zero 
only over a range of angular frequencies of width $W \ll \Omega$. The autocovariance 
function of this noise is 

\[
\begin{aligned}
\phi(\tau) &= \int_{-\infty}^{\infty} \Phi(\omega)\,e^{i\omega\tau}\,\frac{d\omega}{2\pi}
 = \int_{-\infty}^{\infty} \left[\tilde{\Phi}(\omega - \Omega) + \tilde{\Phi}(-\omega - \Omega)\right] e^{i\omega\tau}\,\frac{d\omega}{2\pi} \\
 &= \int_{-\infty}^{\infty} \left[\tilde{\Phi}(\omega)\,e^{i(\omega + \Omega)\tau} + \tilde{\Phi}(\omega)\,e^{-i(\omega + \Omega)\tau}\right] \frac{d\omega}{2\pi}
 = \operatorname{Re}\left[\tilde{\phi}(\tau)\,e^{i\Omega\tau}\right],
\end{aligned}
\]

where the function $\tilde{\phi}(\tau)$, generally complex, is defined by the Fourier transform 

\[ \tilde{\phi}(\tau) = 2\int_{-\infty}^{\infty} \tilde{\Phi}(\omega)\,e^{i\omega\tau}\,\frac{d\omega}{2\pi}, \tag{3-19} \]

and 

\[ \tilde{\Phi}(\omega) = \tfrac{1}{2}\int_{-\infty}^{\infty} \tilde{\phi}(\tau)\,e^{-i\omega\tau}\,d\tau. \tag{3-20} \]

We call $\tilde{\Phi}(\omega)$ the narrowband spectral density and $\tilde{\phi}(\tau)$ the complex autocovariance 
function of the stationary narrowband noise $n(t)$. 
The variance of this random process is 

\[ \operatorname{Var} n(t) = \phi(0) = \tilde{\phi}(0) = 2\int_{-\infty}^{\infty} \tilde{\Phi}(\omega)\,\frac{d\omega}{2\pi}. \tag{3-21} \]

The factor 2 in (3-19) and (3-21) can be remembered by keeping in mind that the 
total power in the process is obtained by integrating the spectral density over all 
frequencies, both positive and negative. That spectral density (3-18) has two peaks, 
each of which contributes half the power. 
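
The bookkeeping behind the factor 2 can be verified with a small numerical sketch. It takes a rectangular narrowband spectral density of height $N/2$ and width $W$ (an assumed shape, with hypothetical parameter names `n0` and `width`), builds the two-peaked density (3-18), and compares the variance obtained by integrating over all frequencies with the value $2\int \tilde{\Phi}\,d\omega/2\pi$ of (3-21).

```python
import math

def phi_tilde(w, n0=2.0, width=10.0):
    """Assumed rectangular narrowband spectral density: N/2 for |w| < W/2."""
    return 0.5 * n0 if abs(w) < 0.5 * width else 0.0

def phi_full(w, omega=1000.0, n0=2.0, width=10.0):
    """Two-peaked spectral density of (3-18)."""
    return phi_tilde(w - omega, n0, width) + phi_tilde(-w - omega, n0, width)

def integrate(f, a, b, steps):
    """Midpoint-rule integral of f over (a, b)."""
    h = (b - a) / steps
    return sum(f(a + (j + 0.5) * h) for j in range(steps)) * h

def variance_from_narrowband(n0=2.0, width=10.0):
    """Right-hand side of (3-21)."""
    f = lambda w: phi_tilde(w, n0, width)
    return 2.0 * integrate(f, -width, width, 2000) / (2.0 * math.pi)

def variance_from_full(omega=1000.0, n0=2.0, width=10.0):
    """Integral of the full spectral density over positive and negative frequencies."""
    f = lambda w: phi_full(w, omega, n0, width)
    return integrate(f, -(omega + width), omega + width, 202000) / (2.0 * math.pi)
```

Both routes give $NW/2\pi$; each peak of the full spectral density supplies half of it.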

When noise of spectral density $\Phi_i(\omega)$ passes into a narrowband filter whose 
transfer function is given by (3-12), the spectral density of the output is, according 
to (2-9), 

\[
\begin{aligned}
\Phi_o(\omega) &= \Phi_i(\omega)\,|y(\omega)|^2 = \Phi_i(\omega)\,\left|Y(\omega - \Omega) + Y^*(-\omega - \Omega)\right|^2 \\
 &\approx \Phi_i(\omega)\left[|Y(\omega - \Omega)|^2 + |Y(-\omega - \Omega)|^2\right].
\end{aligned}
\tag{3-22}
\]

When the absolute square in the first line of (3-22) is expanded, the cross-product 
$\operatorname{Re}[Y(\omega - \Omega)\,Y(-\omega - \Omega)]$ can be neglected because for narrowband filters it is much 
smaller than the other terms. If the spectral density of the input noise is much 
broader than the transfer function of the filter, we can replace $\Phi_i(\omega)$ by its value at 
$\Omega$, and by comparison with (3-18) we find 

\[ \tilde{\Phi}_o(\omega) = |Y(\omega)|^2\,\Phi_i(\Omega) \tag{3-23} \]

for the narrowband spectral density of the output at all frequencies $\omega$ where this 
function is of significant magnitude. If, on the other hand, the input noise is 
narrowband, with a spectral density like that in (3-18), we find 

\[ \tilde{\Phi}_o(\omega) = |Y(\omega)|^2\,\tilde{\Phi}_i(\omega) \tag{3-24} \]

by discarding all the terms in (3-22) that are small in this approximation. This 
equation has the same form as (2-9) and shows that we can use the narrowband 
spectral density and transfer function in much the same way as we use the original 
functions $\Phi(\omega)$ and $y(\omega)$. 

In noise $n(t)$ of this kind, frequencies in the neighborhood of $\Omega$ predominate, 
and we should expect to write it in the quasiharmonic form 

\[ n(t) = \operatorname{Re} N(t)\,e^{i\Omega t}, \tag{3-25} \]

where 

\[ N(t) = X(t) + iY(t) \tag{3-26} \]

is a complex envelope whose quadrature components $X(t)$ and $Y(t)$ are random 
processes varying much more slowly than the carrier. 

Indeed, a stationary random process with spectral density $\Phi(\omega)$ can be 
considered as a succession of randomly occurring pulses $s(t - t_m)$ whose spectra are 
proportional to 

\[ S(\omega) = [\Phi(\omega)]^{1/2}\,e^{ig(\omega)}, \tag{3-27} \]

with $g(\omega)$ an arbitrary phase, 

\[ n(t) = \sum_{m=-\infty}^{\infty} a_m\,s(t - t_m); \tag{3-28} \]

the amplitudes $a_m$ are independently random with zero expected values, and as we 
said in Sec. 2.1.5, the epochs $t_m$ of the pulses form a Poisson point process. Observe 
that (3-28) is the same as (2-30) when we put $s(t) = k(t)$, whereupon $S(\omega) = y(\omega)$. 
By (2-9) with $\Phi_i(\omega) = N/2$, the spectral density of $n(t)$ must be proportional to 
$|y(\omega)|^2 = |S(\omega)|^2$. 

When the spectral density $\Phi(\omega)$ has the form of (3-18), the spectrum $S(\omega)$ in 
(3-27) will have the similar form (3-8), and the component pulses in (3-28) can be 
represented as 

\[ s(t - t_m) = \operatorname{Re}\left[F(t - t_m)\exp i\Omega(t - t_m)\right] \]

in terms of a complex envelope $F(t)$. As a result, the noise $n(t)$ in (3-28) has the 
form of (3-25) with a complex envelope 

\[ N(t) = \sum_{m=-\infty}^{\infty} a_m \exp(i\psi_m)\,F(t - t_m), \qquad \psi_m = -\Omega t_m, \tag{3-29} \]

which represents a sequence of complex pulses $F(t - t_m)$ with random amplitudes 
$a_m$ and random phases $\psi_m$. We shall now study the properties of narrowband or 
quasiharmonic noise of this kind, and in particular we shall relate its autocovariance 
function to those of its quadrature components $X(t)$ and $Y(t)$. 

3.2.2 The Complex Autocovariance Function 

Quasiharmonic noise is not necessarily stationary. In a scatter-multipath 
communication system, for instance, narrowband transmitted pulses impinge on a myriad of 
scatterers randomly located in the ionosphere, each of which reradiates a pulse of the 
same form. These scattered pulses combine with random amplitudes and phases to 
produce a received signal much like that in (3-28) and (3-29), except that the epochs 
$t_m$ do not stretch from $-\infty$ to $\infty$, but occur only during a limited interval determined 
by the thickness of the scattering layer and the angles of the incident and scattered 
beams. A similar phenomenon occurs in radar astronomy when a narrowband radar 
pulse is reflected by numerous randomly located points on the surface of a rough 
planet. In order later to treat the detection of random quasiharmonic signals of this 
kind, we allow the process $n(t) = \operatorname{Re} N(t)\exp i\Omega t$ to be nonstationary. 
Writing the process as 

\[ n(t) = \tfrac{1}{2}\left[N(t)\,e^{i\Omega t} + N^*(t)\,e^{-i\Omega t}\right], \]

we form its autocovariance function 

\[
\begin{aligned}
\phi(t_1, t_2) &= E[n(t_1)\,n(t_2)] \\
 &= \tfrac{1}{4}\{E[N(t_1)N(t_2)]\exp[i\Omega(t_1 + t_2)] + \text{c.c.} \\
 &\qquad + E[N(t_1)N^*(t_2)]\exp[i\Omega(t_1 - t_2)] + \text{c.c.}\},
\end{aligned}
\tag{3-30}
\]

where c.c. denotes the complex conjugate of the preceding term. 

In naturally occurring quasiharmonic random processes, whether stationary or 
not, the term $E[N(t_1)N(t_2)]$ must vanish, for otherwise the process would provide 
information about the phase of the carrier, and it would appear as though the 
process were governed by an inherent "clock" with which we could synchronize our 
own clocks. The variance of the process, for instance, would have the form 

\[ \operatorname{Var} n(t) = E\{[n(t)]^2\} = \tfrac{1}{2}E\{|N(t)|^2\} + \lambda\cos(2\Omega t + \theta), \qquad \lambda = \tfrac{1}{2}\left|E\{[N(t)]^2\}\right|, \]

and the variance would pulsate at frequency $2\Omega$ with a determinate phase $\theta$. When 
we assert that 

\[ E[N(t_1)N(t_2)] = 0, \qquad \forall (t_1, t_2), \tag{3-31} \]

we are denying the existence of a clock inherent in the ensemble of temporal functions 
$n(t)$ that, equipped with a probability measure, defines the random process. We 
assume that for each realization $\operatorname{Re}[N'(t)\exp i\Omega t]$ in the ensemble, the ensemble also 
contains the process $\operatorname{Re}[N'(t)\exp i\psi\,\exp i\Omega t]$ for all possible phases $\psi$ in $(0, 2\pi)$, no 
phase being preferred over any other. This means that when we form the expected 
value in (3-31), we are forming 

\[ E[N'(t_1)\,N'(t_2)\,e^{2i\psi}] \]

and taking an average over a phase $\psi$ uniformly distributed over $(0, 2\pi)$, in addition 
to the average over the totality of complex envelopes $N'(t)$. Because 

\[ E(e^{2i\psi}) = E(\cos 2\psi) + iE(\sin 2\psi) = 0, \]

that expected value must vanish as in (3-31). 
The second expected value in (3-30) is 

\[ \tfrac{1}{2}E[N(t_1)N^*(t_2)] = \tfrac{1}{2}E[N'(t_1)N'^*(t_2)] \]

by the same reasoning, and the phase $\psi$ drops out. We designate this expected value, 
which does not vanish in general, by 

\[ \tilde{\phi}(t_1, t_2) = \tfrac{1}{2}E[N(t_1)N^*(t_2)], \tag{3-32} \]

and we call it the complex autocovariance function of the nonstationary complex 
random process $N(t)$. In terms of it the autocovariance function of the 
quasiharmonic random process $n(t)$ is 

\[ \phi(t_1, t_2) = \operatorname{Re}\left[\tilde{\phi}(t_1, t_2)\exp i\Omega(t_1 - t_2)\right] \]

from (3-30). The complex autocovariance function $\tilde{\phi}(t_1, t_2)$ possesses the Hermitian 
property 

\[ \tilde{\phi}(t_1, t_2) = \tilde{\phi}^*(t_2, t_1), \tag{3-33} \]

which follows immediately from its definition (3-32). 

The autocovariance and cross-covariance functions of the quadrature 
components $X(t)$ and $Y(t)$ defined by (3-26) are 

\[
\begin{aligned}
E[X(t_1)X(t_2)] &= E[Y(t_1)Y(t_2)] = \operatorname{Re}\tilde{\phi}(t_1, t_2), \\
E[Y(t_1)X(t_2)] &= -E[X(t_1)Y(t_2)] = \operatorname{Im}\tilde{\phi}(t_1, t_2).
\end{aligned}
\tag{3-34}
\]

Unless the complex autocovariance function $\tilde{\phi}(t_1, t_2)$ happens to be real, the real 
and imaginary parts of the complex envelope $N(t)$ are correlated random processes, 
except for their values observed at the same time $t_1 = t_2$. 
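
The phase-averaging argument behind (3-31) and (3-34) can be made concrete with a deterministic sketch. For a fixed pair of envelope values $N'(t_1)$, $N'(t_2)$ it averages over carrier phases $\psi$ spaced uniformly around the circle, in place of the uniform distribution over $(0, 2\pi)$. The function name and the returned dictionary keys are assumptions made for this illustration only.

```python
import cmath
import math

def phase_average(n1, n2, phases=360):
    """Average products of N = N' exp(i psi) over uniformly spaced phases psi.

    n1, n2 are fixed complex values standing for N'(t1) and N'(t2).
    """
    acc = {"NN": 0j, "NNconj": 0j, "XX": 0.0, "YY": 0.0, "YX": 0.0, "XY": 0.0}
    for j in range(phases):
        rot = cmath.exp(1j * 2.0 * math.pi * j / phases)
        a, b = n1 * rot, n2 * rot
        acc["NN"] += a * b
        acc["NNconj"] += a * b.conjugate()
        acc["XX"] += a.real * b.real
        acc["YY"] += a.imag * b.imag
        acc["YX"] += a.imag * b.real
        acc["XY"] += a.real * b.imag
    return {key: value / phases for key, value in acc.items()}
```

The average of $N(t_1)N(t_2)$ vanishes, the average of $N(t_1)N^*(t_2)$ is untouched by the phase, and the averages of the quadrature products equal the real and imaginary parts of $\tfrac{1}{2}N'(t_1)N'^*(t_2)$ with the signs shown in (3-34).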

The complex autocovariance function $\tilde{\phi}(t_1, t_2)$ of a narrowband random 
process $n(t) = \operatorname{Re} N(t)\exp i\Omega t$ is nonnegative definite because for any complex function 
$g(t)$, 

\[ \tfrac{1}{2}E\left|\int_0^T g^*(t)\,N(t)\,dt\right|^2 \ge 0, \]

so that 

\[ \int_0^T\!\!\int_0^T g^*(t_1)\,\tilde{\phi}(t_1, t_2)\,g(t_2)\,dt_1\,dt_2 \ge 0, \]

which is the criterion that the kernel $\tilde{\phi}(t_1, t_2)$ of such a quadratic form be nonnegative 
definite. We shall assume, as in (2-40), that $\tilde{\phi}(t_1, t_2)$ is positive definite, so that for 
no function $g(t)$ except $g(t) \equiv 0$ does 

\[ \int_0^T g^*(t)\,N(t)\,dt \]

vanish. 

When the quasiharmonic process $n(t)$ is stationary, its complex autocovariance 
function $\tilde{\phi}(t_1, t_2)$ depends on the times $t_1$ and $t_2$ only through their difference $\tau = 
t_1 - t_2$, 

\[ \tilde{\phi}(t_1, t_2) \to \tilde{\phi}(t_1 - t_2) = \tilde{\phi}(\tau), \]

and this complex autocovariance function $\tilde{\phi}(\tau)$ is the same as that defined by (3-19) 
in terms of the narrowband spectral density $\tilde{\Phi}(\omega)$ of the process. The average power 
in the process 

\[
\begin{aligned}
n(t) &= X(t)\cos\Omega t - Y(t)\sin\Omega t \\
 &= X'(t)\cos(\Omega t + \psi) - Y'(t)\sin(\Omega t + \psi), \qquad N'(t) = X'(t) + iY'(t),
\end{aligned}
\]

is 

\[ E\{[n(t)]^2\} = \tilde{\phi}(0) = \tfrac{1}{2}E[|N(t)|^2] = \tfrac{1}{2}\left\{E\{[X(t)]^2\} + E\{[Y(t)]^2\}\right\}. \]

Remembering that all phases $\psi$ are equally likely, we can think of the $\tfrac{1}{2}$ appearing 
here as representing the averages of the factors $\cos^2(\Omega t + \psi)$ and $\sin^2(\Omega t + \psi)$ that 
figure in $[n(t)]^2$. 

The Hermitian property (3-33) of complex autocovariances implies that 

\[ \tilde{\phi}(-\tau) = \tilde{\phi}^*(\tau). \]

The real part of $\tilde{\phi}(\tau) = \phi_x(\tau) + i\phi_y(\tau)$ is therefore an even function and the 
imaginary part an odd function: 

\[ \phi_x(-\tau) = \phi_x(\tau), \qquad \phi_y(-\tau) = -\phi_y(\tau). \]

As a quasiharmonic function, the random process $n(t)$ must have an amplitude 
modulation $|N(t)|$ and a phase modulation $\arg N(t)$ of the same kind as described in 
Sec. 3.1. From the statistics of $n(t)$ one can determine the probability distributions 
of the outputs of rectifiers, discriminators, and other devices whose inputs are 
narrowband signals and noise. The statistical properties of the envelope of narrowband 
noise have been treated in [Ric44], [Mid48], [Bun49], [Are57], [Dug58], and [Ree62]. 
When the noise is both narrowband and Gaussian, its joint probability density 
functions can be put into an especially simple form that facilitates calculations. We turn 
now to developing it. 

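3.2.3 Circular Gaussian Random Processes 

Just as in Chapter 2 an ordinary random process was sampled by means of the 
coefficients in its Fourier expansion in terms of a set of orthonormal functions, so 
can we sample a narrowband random process $n(t)$ by a similar expansion of its 
complex envelope $N(t)$, 

\[ N(t) = \sum_{k=1}^{\infty} z_k\,f_k(t), \]

in which the functions $f_k(t)$ may be complex and are orthonormal in the sense of 
(2-39): 

\[ \int_0^T f_k^*(t)\,f_m(t)\,dt = \delta_{km}. \]

The complex samples of the envelope are then 

\[ z_k = x_k + iy_k = \int_0^T f_k^*(t)\,N(t)\,dt. \tag{3-35} \]

From (3-32) we obtain the complex autocovariance matrix $\tilde{\boldsymbol{\phi}}$ of the samples $z_k$; its 
elements are 

\[
\begin{aligned}
\tilde{\phi}_{km} = \tfrac{1}{2}E(z_k z_m^*) &= \tfrac{1}{2}\int_0^T\!\!\int_0^T f_k^*(t)\,E[N(t)N^*(s)]\,f_m(s)\,dt\,ds \\
 &= \int_0^T\!\!\int_0^T f_k^*(t)\,\tilde{\phi}(t, s)\,f_m(s)\,dt\,ds.
\end{aligned}
\tag{3-36}
\]

The matrix $\tilde{\boldsymbol{\phi}}$ is Hermitian, 

\[ \tilde{\boldsymbol{\phi}}^{+} = \tilde{\boldsymbol{\phi}}, \]

where $\tilde{\boldsymbol{\phi}}^{+}$ indicates a matrix derived from $\tilde{\boldsymbol{\phi}}$ by transposing rows and columns and 
taking the complex conjugate of each element; the $km$ element of $\tilde{\boldsymbol{\phi}}^{+}$ is $\tilde{\phi}_{mk}^*$, and by 
(3-33) and (3-36) $\tilde{\phi}_{mk}^* = \tilde{\phi}_{km}$. 

Furthermore, because of (3-31), 

\[ E(z_k z_m) = E(z_k^* z_m^*) = 0, \qquad \forall (k, m). \]

From these relations we find the counterparts of (3-34): 

\[
\begin{aligned}
E(x_k x_m) &= E(y_k y_m) = \operatorname{Re}\tilde{\phi}_{km} = \phi_{x,km}, \\
E(y_k x_m) &= -E(x_k y_m) = \operatorname{Im}\tilde{\phi}_{km} = \phi_{y,km},
\end{aligned}
\tag{3-37}
\]

where $\tilde{\boldsymbol{\phi}} = \boldsymbol{\phi}_x + i\boldsymbol{\phi}_y$. Thus the $2n \times 2n$ covariance matrix $\boldsymbol{\Phi}$ of the $2n$ random 
variables $x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n$ can be divided into $n \times n$ blocks 

\[ \boldsymbol{\Phi} = \begin{bmatrix} \boldsymbol{\phi}_x & -\boldsymbol{\phi}_y \\ \boldsymbol{\phi}_y & \boldsymbol{\phi}_x \end{bmatrix}. \tag{3-38} \]

In particular $\phi_{y,kk} = E(x_k y_k) = 0$, and 

\[ E(x_k^2) = E(y_k^2) = \tilde{\phi}_{kk}. \]

For instance, to the $4 \times 4$ matrix 

\[
\boldsymbol{\Phi} = \begin{bmatrix} 4 & 2 & 0 & -1 \\ 2 & 3 & 1 & 0 \\ 0 & 1 & 4 & 2 \\ -1 & 0 & 2 & 3 \end{bmatrix}
\]

with 

\[
\boldsymbol{\phi}_x = \begin{bmatrix} 4 & 2 \\ 2 & 3 \end{bmatrix}, \qquad
\boldsymbol{\phi}_y = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix},
\]

corresponds the $2 \times 2$ Hermitian matrix 

\[ \tilde{\boldsymbol{\phi}} = \boldsymbol{\phi}_x + i\boldsymbol{\phi}_y = \begin{bmatrix} 4 & 2 + i \\ 2 - i & 3 \end{bmatrix}. \]

Its inverse is 

\[ \tilde{\boldsymbol{\phi}}^{-1} = \frac{1}{7}\begin{bmatrix} 3 & -2 - i \\ -2 + i & 4 \end{bmatrix}, \]

from which we can write down the inverse of the $4 \times 4$ matrix $\boldsymbol{\Phi}$: 

\[
\boldsymbol{\Phi}^{-1} = \frac{1}{7}\begin{bmatrix} 3 & -2 & 0 & 1 \\ -2 & 4 & -1 & 0 \\ 0 & -1 & 3 & -2 \\ 1 & 0 & -2 & 4 \end{bmatrix}.
\]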
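
The worked example can be checked mechanically. The sketch below, whose helper names are our own, builds the $2n \times 2n$ real matrix from an $n \times n$ Hermitian one using the block arrangement of (3-38), then confirms the determinants and the two inverses quoted above.

```python
def real_block_form(h):
    """2n x 2n real matrix [[hx, -hy], [hy, hx]] from the Hermitian h = hx + i hy."""
    n = len(h)
    top = [[h[k][m].real for m in range(n)] + [-h[k][m].imag for m in range(n)]
           for k in range(n)]
    bottom = [[h[k][m].imag for m in range(n)] + [h[k][m].real for m in range(n)]
              for k in range(n)]
    return top + bottom

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

PHI_TILDE = [[4 + 0j, 2 + 1j], [2 - 1j, 3 + 0j]]          # the 2 x 2 Hermitian example
SEVEN_TIMES_INVERSE = [[3 + 0j, -2 - 1j], [-2 + 1j, 4 + 0j]]  # 7 times its inverse

PHI_REAL = real_block_form(PHI_TILDE)                      # the 4 x 4 matrix
SEVEN_TIMES_REAL_INVERSE = real_block_form(SEVEN_TIMES_INVERSE)
```

Here `det(PHI_TILDE)` is 7, `det(PHI_REAL)` is 49, and multiplying each matrix by seven times its stated inverse gives seven times the identity.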



When the narrowband noise $n(t)$ is Gaussian, the random variables $x_k$ and 
$y_k$, $k = 1, 2, \ldots, n$, have a jointly Gaussian probability density function that is 
determined entirely by the covariance matrix $\boldsymbol{\Phi}$ defined through (3-34), (3-37), and 
(3-38). In particular, the bivariate probability density function of the real and 
imaginary parts of $z_k$ is 

\[ p(x_k, y_k) = \frac{1}{2\pi\tilde{\phi}_{kk}}\exp\left[-\frac{x_k^2 + y_k^2}{2\tilde{\phi}_{kk}}\right]. \tag{3-39} \]

Because of the circular symmetry of this probability density function $p(x_k, y_k)$, the 
complex random variables $z_k = x_k + iy_k$ are called circular complex Gaussian 
random variables. 

The joint probability density function of the real and imaginary parts of $n$ of 
these complex random variables has the circular Gaussian form 

\[
\begin{aligned}
p(x_1, \ldots, x_n, y_1, \ldots, y_n) &= \tilde{p}(z_1, z_2, \ldots, z_n) \\
 &= (2\pi)^{-n}(\det\tilde{\boldsymbol{\phi}})^{-1}\exp\left(-\tfrac{1}{2}\mathbf{z}^{+}\tilde{\boldsymbol{\phi}}^{-1}\mathbf{z}\right) \\
 &= (2\pi)^{-n}(\det\tilde{\boldsymbol{\phi}})^{-1}\exp\left(-\tfrac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\mu_{ij}\,z_i^*\,z_j\right),
\end{aligned}
\tag{3-40}
\]

which is shown in Appendix B to be equivalent to the usual Gaussian density function 
involving the $2n \times 2n$ matrix $\boldsymbol{\Phi}$ of (3-38). Here $\mu_{ij}$ are the elements of the inverse 
$\tilde{\boldsymbol{\phi}}^{-1} = \boldsymbol{\mu}$ of the Hermitian matrix $\tilde{\boldsymbol{\phi}} = \boldsymbol{\phi}_x + i\boldsymbol{\phi}_y$, $\mathbf{z}$ is an $n$-element column vector 
of the complex numbers $z_1, z_2, \ldots, z_n$, and 

\[ \mathbf{z}^{+} = (z_1^*, z_2^*, \ldots, z_n^*) \]

is its transposed conjugate row vector. Furthermore 

\[ (\det\tilde{\boldsymbol{\phi}})^2 = \det\boldsymbol{\Phi} = \det\begin{bmatrix} \boldsymbol{\phi}_x & -\boldsymbol{\phi}_y \\ \boldsymbol{\phi}_y & \boldsymbol{\phi}_x \end{bmatrix} \]

is the square of the determinant of the $n \times n$ Hermitian matrix $\tilde{\boldsymbol{\phi}}$. In our example, 
for instance, $\det\tilde{\boldsymbol{\phi}} = 7$ and $\det\boldsymbol{\Phi} = 49$. 

Equation (3-40) is only a concise way of writing the joint probability density 
function of the $2n$ random variables $x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n$; it should not be 
considered as the probability density function of $z_1, z_2, \ldots, z_n$. When the real and 
imaginary parts of the samples $z_k$ of the complex random process $N(t)$, defined as 
in (3-35), possess such a circular Gaussian probability density function, $N(t)$ is said 
to be a circular Gaussian random process, and the complex envelope of narrowband 
Gaussian noise is a process of this kind. 

When a narrowband signal $s(t) = \operatorname{Re} S(t)\exp i\Omega t$ is present in addition to 
this narrowband Gaussian noise, the complex samples have expected values that 
are not zero, but are given by 

\[ E(z_k) = S_k = \int_0^T f_k^*(t)\,S(t)\,dt, \]

and the joint probability density function of the samples $x_k$, $y_k$ becomes, in place of 
(3-40), 

\[
\tilde{p}(z_1, z_2, \ldots, z_n) = (2\pi)^{-n}(\det\tilde{\boldsymbol{\phi}})^{-1}
\exp\left[-\tfrac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\mu_{ij}\,(z_i^* - S_i^*)(z_j - S_j)\right].
\tag{3-41}
\]

The joint characteristic function of the $2n$ random variables $x_1, x_2, \ldots, x_n$, 
$y_1, y_2, \ldots, y_n$ can be written in a similarly concise form. It is the $2n$-dimensional 
Fourier transform of the joint density function in (3-41), and Appendix B shows that 
it can be expressed as 

\[
\begin{aligned}
h(w_1, \ldots, w_n; w_1^*, \ldots, w_n^*) &= E\exp\left[i\sum_{j=1}^{n}(u_{xj}x_j + u_{yj}y_j)\right] \\
 &= \exp\left[\tfrac{1}{2}i\sum_{j=1}^{n}(w_j^* S_j + w_j S_j^*) - \tfrac{1}{2}\sum_{k=1}^{n}\sum_{m=1}^{n}\tilde{\phi}_{km}\,w_k^*\,w_m\right]
\end{aligned}
\tag{3-42}
\]

with $w_j = u_{xj} + iu_{yj}$. 

By writing 

\[ h(w_1, \ldots, w_n; w_1^*, \ldots, w_n^*) = E\exp\left[\tfrac{1}{2}i\sum_{j=1}^{n}(w_j^* z_j + w_j z_j^*)\right] \tag{3-43} \]

and regarding $w_j$ and $w_j^*$ as mathematically distinct variables, we can derive the 
covariances of the samples of a circular Gaussian process by differentiating (3-42). 
Thus if we take $S_j = 0$ for simplicity, we find, with all partial derivatives evaluated 
at $w_j = 0$, $j = 1, 2, \ldots, n$, 

\[ \tfrac{1}{2}E(z_k z_m^*) = -2\left.\frac{\partial^2 h}{\partial w_k^*\,\partial w_m}\right|_{w=0} = \tilde{\phi}_{km} \]

as in (3-36). Furthermore, 

\[
\tfrac{1}{4}E(z_1^* z_2 z_3^* z_4) = 4\left.\frac{\partial^4 h}{\partial w_1\,\partial w_2^*\,\partial w_3\,\partial w_4^*}\right|_{w=0}
 = \tilde{\phi}_{21}\tilde{\phi}_{43} + \tilde{\phi}_{23}\tilde{\phi}_{41}.
\tag{3-44}
\]

Formulas for moments of higher order have been given by Reed [Ree62]; they follow 
the same pattern as (3-44). With $S_k = 0$, the expected value of a product vanishes 
unless the numbers of starred and unstarred factors are equal. When those numbers 
are equal, one forms all possible pairs of starred and unstarred $z$'s as in (3-44), 
multiplies the expected values of the paired products, and adds. 
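
The pairing rule can be tried out by simulation. The sketch below is a Monte Carlo check, not a derivation: it draws pairs $(z_1, z_2)$ of circular complex Gaussian variables whose complex covariance matrix is the $2 \times 2$ Hermitian matrix of the earlier example, and estimates $\tfrac{1}{2}E(z_1 z_2^*)$, $E(z_1 z_2)$, and $\tfrac{1}{4}E(|z_1|^2|z_2|^2)$. By the pairing rule the last should approach $\tilde{\phi}_{11}\tilde{\phi}_{22} + \tilde{\phi}_{12}\tilde{\phi}_{21} = 12 + 5 = 17$. The Cholesky-style construction, the seed, and the sample count are assumptions of this illustration.

```python
import math
import random

PHI = [[4 + 0j, 2 + 1j], [2 - 1j, 3 + 0j]]   # complex covariance, (1/2) E(z z^+)

# Lower-triangular factor L with L L^+ = PHI.
L11 = math.sqrt(PHI[0][0].real)
L21 = PHI[1][0] / L11
L22 = math.sqrt(PHI[1][1].real - abs(L21) ** 2)

def sample_pair(rng):
    """One draw of (z1, z2); w1, w2 have independent unit-variance real and imaginary parts."""
    w1 = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    w2 = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    return L11 * w1, L21 * w1 + L22 * w2

def estimate_moments(samples=200000, seed=1):
    """Sample averages of (1/2) z1 z2*, z1 z2, and (1/4)|z1|^2 |z2|^2."""
    rng = random.Random(seed)
    second = 0j
    improper = 0j
    fourth = 0.0
    for _ in range(samples):
        z1, z2 = sample_pair(rng)
        second += 0.5 * z1 * z2.conjugate()
        improper += z1 * z2
        fourth += 0.25 * abs(z1) ** 2 * abs(z2) ** 2
    return second / samples, improper / samples, fourth / samples
```

With a few hundred thousand samples the three estimates settle near $2 + i$, $0$, and $17$, within ordinary sampling fluctuations.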



3.2.4 Narrowband White Noise 



Ordinary white noise cannot be expressed directly in terms of quadrature 
components, for its spectral density occupies much too wide a range of frequencies. In 
analyzing the detection of a narrowband signal in white noise, however, it would be 
convenient to represent the input noise in this way. As we said at the beginning of 
this section, we can always assume that the signal and the noise have passed through 
a filter whose passband includes the spectrum of the signal, but is much wider. 
Usually it will be possible to treat this new filter as narrowband and represent its transfer 
function as in (3-12). It can have little effect on signal detectability, however, for it 
cuts out only noise components with frequencies far from those of the signal, 
affecting the signal hardly at all. Then one can conveniently write the white noise $n(t)$ as 

\[ n(t) = \operatorname{Re} N(t)\,e^{i\Omega t}, \qquad N(t) = X(t) + iY(t), \]

with 

\[
\begin{aligned}
E[X(t_1)X(t_2)] &= E[Y(t_1)Y(t_2)] = N\delta(t_1 - t_2), \\
E[X(t_1)Y(t_2)] &= 0, \\
\tfrac{1}{2}E[N(t_1)N^*(t_2)] &= N\delta(t_1 - t_2), \qquad E[N(t_1)N(t_2)] = 0,
\end{aligned}
\tag{3-45}
\]

where $N$ is the unilateral spectral density of the white noise. These relations are 
consistent with the definition in (3-19) and with (3-23) and (3-24), in which both 
$\Phi_i(\omega)$ and $\tilde{\Phi}_i(\omega)$ can be set equal to $N/2$. 

There is no formal difficulty with regarding white noise as narrowband noise 
whose spectral density is uniform over a range of frequencies (usually those of a 
quasiharmonic signal) of interest in many detection problems, and a considerable 
simplification follows from this viewpoint. Samples $x_k$ and $y_k$, $1 \le k \le n$, of white 
noise, defined as in (3-35), are statistically independent, and by (3-36) and (3-45) 
their joint probability density function has the circular-complex Gaussian form 

\[ \tilde{p}(z_1, z_2, \ldots, z_n) = (2\pi N)^{-n}\exp\left(-\frac{1}{2N}\sum_{k=1}^{n}|z_k|^2\right). \tag{3-46} \]

The Hermitian quadratic form in the exponent of (3-40) is now simply the sum of 
the absolute squares of the $z_k$'s, divided by $N$. 



3.3 DETECTION OF A SIGNAL OF RANDOM PHASE 

3.3.1 The Likelihood Functional lor a Narrowband Input 

A narrowband signal 

j(0 = Re S{t) e iSlt 

with complex envelope S(t) is to be detected in the presence of Gaussian noise «(/) 
that for the reasons mentioned at the beginning of Sec. 3.2 we can also presume to 
be narrowband: 

«(/) = ReN(t)e iSlt . 

Its complex envelope N(t) is a circular Gaussian random process with complex auto- 
covariance function <{>(ri, t 2 ) as in (3-32). When as here the input v{t) ~ Re V(t) exp 
iflt is narrowband, both the real and imaginary parts of its complex envelope V(t) 
can be measured separately by mixing the input with 2 cos ft/ and 2 sin ftf, respec- 
tively, as described in Sec. 3.1 and illustrated in Fig. 3-2. We can therefore assume 
that the complex envelope V(t) itself is available to the receiver and that the receiver 
will base its decision between the two hypotheses 

H : V(t) = N(t), 

Hi: V(t) = S(/) + N(t), 0<t <T, 
on the complex envelope V(t). 
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The complex envelope V{l) is sampled by determining the complex coefficients 
V k of its Fourier expansion in terms of a set of functions/* (/) orthonormal over the 
observation interval (0, T) as in (2-39): 



Vk = h*{i)V{i)dt- 
Jo 

compare (3-35). Under hypothesis H Q the real and imaginary parts of V k are Gaus- 
sian random variables with expected values and a joint probability density function 
of the circular Gaussian form in (3-40). For simplicity we take the functions f k {t) 
as the eigenfunclions of the complex autocovariance function $(/ ( , t 2 ) of the noise: 

A*/a(0 = f $(t,u)f k (u)du. 
Jo 

Because as in (3-33) the complex autocovariance function is Hermitian, the eigenvalues
$\lambda_k$ are real; and because $\tilde{\phi}(t_1, t_2)$ is positive definite, they are positive. Then
the covariance matrix $\boldsymbol{\phi}$ of the samples $V_1, V_2, \ldots, V_n$ is diagonal,
as is its inverse $\boldsymbol{\phi}^{-1} = \boldsymbol{\mu}$ figuring in (3-40); and the joint probability
density function of the real and imaginary parts of the first $n$ complex samples
$V_1, V_2, \ldots, V_n$ takes the form

$$p_0(\mathbf{V}) = \prod_{k=1}^{n} (2\pi\lambda_k)^{-1} \exp\left(-\frac{|V_k|^2}{2\lambda_k}\right). \tag{3-47}$$



$\mathbf{V}$ stands for the set of $n$ circular complex Gaussian random variables $V_1, V_2, \ldots, V_n$.
Under hypothesis $H_1$ the expected value of the $k$th complex sample $V_k$ is

$$E(V_k \mid H_1) = S_k = \int_0^T f_k^*(t)\, S(t)\, dt, \tag{3-48}$$

where $S(t)$ is the complex envelope of the signal. The covariance matrix of the
samples and its inverse are unchanged, and the joint probability density function of
the real and imaginary parts of the samples $V_k$ is, as in (3-41),

$$p_1(\mathbf{V}) = \prod_{k=1}^{n} (2\pi\lambda_k)^{-1} \exp\left(-\frac{|V_k - S_k|^2}{2\lambda_k}\right). \tag{3-49}$$



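The decision between the two hypotheses is optimally based on the likelihood ratio

$$\Lambda(\mathbf{V}) = \frac{p_1(\mathbf{V})}{p_0(\mathbf{V})}
= \prod_{k=1}^{n} \exp\left[\frac{|V_k|^2 - |V_k - S_k|^2}{2\lambda_k}\right]
= \exp\left[\operatorname{Re}\sum_{k=1}^{n} \frac{S_k^* V_k}{\lambda_k} - \sum_{k=1}^{n} \frac{|S_k|^2}{2\lambda_k}\right]. \tag{3-50}$$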



The data $V_1, V_2, \ldots, V_n$ appear only in the first summation, whose real part is a
sufficient statistic when only $n$ samples are included. Again we utilize all the data
in the input $v(t)$ by passing to the limit $n \to \infty$, and the decision is based on the
sufficient statistic

$$G = \operatorname{Re} \sum_{k=1}^{\infty} Q_k^* V_k$$

with $Q_k = S_k/\lambda_k$. By the same procedure as we used in Sec. 2.2, we can write our
sufficient statistic as

$$G = \operatorname{Re} \int_0^T Q^*(t)\, V(t)\, dt, \tag{3-51}$$

where $Q(t)$ is the solution of the integral equation

$$S(t) = \int_0^T \tilde{\phi}(t, u)\, Q(u)\, du, \qquad 0 < t < T, \tag{3-52}$$

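whose kernel is the complex autocovariance function of the narrowband noise. The
signal-to-noise ratio is now

$$d^2 = \sum_{k=1}^{\infty} \frac{|S_k|^2}{\lambda_k} = \sum_{k=1}^{\infty} S_k^* Q_k = \int_0^T S^*(t)\, Q(t)\, dt, \tag{3-53}$$

and the likelihood functional in terms of the complex envelope $V(t)$ of the input is
determined by taking (3-50) to the limit $n \to \infty$:

$$\Lambda[V(t)] = \exp\left(G - \tfrac{1}{2} d^2\right)
= \exp\left[\operatorname{Re}\int_0^T Q^*(t)\, V(t)\, dt - \tfrac{1}{2}\int_0^T S^*(t)\, Q(t)\, dt\right]. \tag{3-54}$$

When the noise is white, $Q(t) = S(t)/N$ by (3-45).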
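
As a rough numerical illustration of (3-51), (3-53), and (3-54) in the white-noise case, the sketch below approximates the integrals by Riemann sums. The constant envelope, the sample count, and the value of N are hypothetical choices made only for the example; nothing here is taken from the text beyond the three formulas.

```python
import math

def coherent_statistic(S, V, N, dt):
    """Riemann-sum sketch of G (3-51) and d^2 (3-53) for white noise, Q(t) = S(t)/N."""
    Q = [s / N for s in S]
    G = sum((q.conjugate() * v).real for q, v in zip(Q, V)) * dt
    d2 = sum((s.conjugate() * q).real for s, q in zip(S, Q)) * dt
    return G, d2

def likelihood_functional(G, d2):
    """Lambda[V(t)] = exp(G - d^2/2), eq. (3-54)."""
    return math.exp(G - 0.5 * d2)

# Hypothetical example: constant envelope S(t) = 2 on (0, T), spectral density N = 0.5.
T, n, N = 1.0, 1000, 0.5
dt = T / n
S = [complex(2.0, 0.0)] * n
G_signal, d2 = coherent_statistic(S, S, N, dt)        # noise-free input V = S
G_empty, _ = coherent_statistic(S, [0j] * n, N, dt)   # input containing nothing at all
print(d2, G_signal, G_empty, likelihood_functional(G_signal, d2))
```

With a noise-free input the statistic G equals d², which is the signal component of G under hypothesis H₁.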

The matched filter is now a narrowband filter with complex impulse response

$$K(\tau) = \begin{cases} Q^*(T - \tau), & 0 \le \tau \le T, \\ 0, & \tau < 0,\ \tau > T. \end{cases} \tag{3-55}$$

The output of this filter at time $t$ is a narrowband random process
$v_0(t) = \operatorname{Re} V_0(t) \exp i\Omega t$ whose complex envelope, as in (3-16), is

$$V_0(t) = X_0(t) + iY_0(t) = \int_0^T K(\tau)\, V(t - \tau)\, d\tau
= \int_0^T Q^*(T - \tau)\, V(t - \tau)\, d\tau = \int_{t-T}^{t} Q^*(T - t + u)\, V(u)\, du,$$

and the statistic $G$ on which the decision is based is obtained by sampling the real
part $X_0(t) = \operatorname{Re} V_0(t)$ of this output at the end of the observation interval,

$$G = X_0(T) = \operatorname{Re} V_0(T).$$

The real part $X_0(t)$ can be obtained by mixing the output of the filter with a locally
generated signal $2\cos\Omega t$.

3.3.2 Signals of Random Phase

The detection strategy developed in Sec. 3.3.1 requires that the phase of the carrier
of the signal be known precisely. If that phase is in error by $\psi$, the signal, instead
of being $\operatorname{Re} S(t) \exp i\Omega t$, is

$$s(t; \psi) = \operatorname{Re} S(t)\, e^{i\Omega t + i\psi}. \tag{3-56}$$

The signal component of the statistic $G$ under hypothesis $H_1$ will then be not $d^2$
but

$$E[G \mid H_1, \psi] = \operatorname{Re} e^{i\psi} \int_0^T Q^*(t)\, S(t)\, dt = d^2 \cos\psi.$$

If the phase error $\psi$ is more than 90°, this component will be negative, and the
probability of detecting the signal will be less than the false-alarm probability $Q_0$.

Consider a radar system set up to determine whether a target is present at a
certain distance from its transmitter. A typical signal has the form

$$s(t) = \operatorname{Re} F(t - t_0) \exp i\Omega(t - t_0),$$

where $F(t)$ is the complex envelope of the transmitted pulse, $\Omega$ is the carrier
frequency, and $t_0$ is the time when some distinguishing point of the echo from the
target reaches the receiver. For narrowband signals the carrier frequency is so large
that many cycles of the carrier occur within the duration of the pulse. The time $t_0$
is proportional to the distance from the target to the receiving antenna. A small
change in $t_0$ on the order of $(1/\Omega)$, corresponding to an alteration in the distance of
the target on the order of a wavelength of the radiation, makes a very small change
in the envelope $F(t - t_0)$, but a large change in the carrier phase $(-\Omega t_0)$. [A typical
value of the pulse duration is $10^{-6}$ sec; $(1/\Omega)$ may be on the order of $10^{-9}$ sec.]
The distance to the target will seldom be specified within a fraction of a wavelength
of the radiation, and hence not precisely enough to determine the phase $\psi = -\Omega t_0$
of the received echo within a fraction of $2\pi$, although one may be able to time its
arrival within a small fraction of the width of the pulse envelope $|F(t)|$. In a
communication system the phase $\psi$ of the carrier of the signal may also be uncertain
by a large multiple of $2\pi$ unless the distance from transmitter to receiver is known
within a fraction of a wavelength or the phase of the carrier has been tracked since
the inception of transmission by some device such as a phase-lock loop.

When the phase $\psi$ of the carrier is uncertain by a considerable fraction of $2\pi$
or more, the observer generally has no reason to assign to it one value rather than
another. In effect the receiver is called on to detect any one of a class of narrowband
signals $s(t; \psi)$ as in (3-56), with the phase $\psi$ a random variable uniformly distributed
over $(0, 2\pi)$. This is our first example of the detection of a signal one or more of
whose parameters are unknown. A hypothesis of the form "A signal from a class
of signals having parameter values lying in a certain range is present" is known as a
composite hypothesis.

When the complex envelope of the signal is $S(t) \exp i\psi$ instead of $S(t)$, the
signal samples $S_k$ in (3-48) must be replaced by $S_k \exp i\psi$, and the joint probability
density function of the real and imaginary parts of the complex samples
$V_1, V_2, \ldots, V_n$ under the now composite hypothesis $H_1$ becomes, in place of (3-49),

$$p_1(\mathbf{V}; \psi) = \prod_{k=1}^{n} (2\pi\lambda_k)^{-1} \exp\left(-\frac{|V_k - S_k e^{i\psi}|^2}{2\lambda_k}\right). \tag{3-57}$$

The actual joint probability density function of these data, taking into account the
randomness of the phase $\psi$, is obtained by averaging (3-57) over $0 < \psi < 2\pi$:

$$p_1(\mathbf{V}) = \int_0^{2\pi} p_1(\mathbf{V}; \psi)\, \frac{d\psi}{2\pi}. \tag{3-58}$$

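Under hypothesis $H_0$ the joint probability density function of the data remains $p_0(\mathbf{V})$
as in (3-47). The likelihood ratio on which the decision is based is then not (3-50),
but in the limit $n \to \infty$

$$\Lambda[V(t)] = \lim_{n\to\infty} \int_0^{2\pi} \frac{p_1(\mathbf{V}; \psi)}{p_0(\mathbf{V})}\, \frac{d\psi}{2\pi}
= \int_0^{2\pi} \Lambda[V(t); \psi]\, \frac{d\psi}{2\pi},$$

where $\Lambda[V(t); \psi]$ is a likelihood functional obtained from that in (3-54) by replacing
$S(t)$ by $S(t) \exp i\psi$. By (3-52) $Q(t)$ must be replaced by $Q(t) \exp i\psi$, and the
likelihood functional becomes, in place of (3-54),

$$\Lambda[V(t)] = \int_0^{2\pi} \exp\left[\operatorname{Re} e^{-i\psi}\int_0^T Q^*(t)\, V(t)\, dt - \tfrac{1}{2}\int_0^T S^*(t)\, Q(t)\, dt\right] \frac{d\psi}{2\pi}.$$

In order to evaluate this average over the phase $\psi$, we put

$$\int_0^T Q^*(t)\, V(t)\, dt = R\, e^{i\theta}, \qquad R = \left|\int_0^T Q^*(t)\, V(t)\, dt\right|. \tag{3-59}$$

Then with (3-53),

$$\Lambda[V(t)] = \int_0^{2\pi} \exp\left[R\cos(\theta - \psi) - \tfrac{1}{2} d^2\right] \frac{d\psi}{2\pi} = e^{-d^2/2}\, I_0(R), \tag{3-60}$$

where

$$I_0(x) = \int_0^{2\pi} e^{x\cos\psi}\, \frac{d\psi}{2\pi} = \sum_{k=0}^{\infty} \frac{(x/2)^{2k}}{(k!)^2} \tag{3-61}$$

is the modified Bessel function of order zero.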
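
The two expressions for I₀(x) in (3-61) can be checked against each other numerically. The following is a minimal sketch, not a production routine: the function names, the 64-point trapezoidal rule, and the 60-term truncation of the series are arbitrary choices made for illustration, adequate only for moderate arguments.

```python
import math

def i0_integral(x, m=64):
    """Trapezoidal rule applied to the phase average in (3-61)."""
    return sum(math.exp(x * math.cos(2.0 * math.pi * j / m)) for j in range(m)) / m

def i0_series(x, terms=60):
    """Power series sum_k (x/2)^(2k) / (k!)^2, accumulated term by term."""
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / 2.0) ** 2 / (k * k)   # ratio of successive terms
        total += term
    return total

for x in (0.0, 1.0, 3.0):
    print(x, i0_integral(x), i0_series(x))
```

Because the integrand is smooth and periodic, the equally spaced rule converges very quickly, which is why so few points suffice here.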

The data $V(t)$ now appear in the likelihood functional only through the quantity
$I_0(R)$, which is a monotone function of $R$; and $R$, defined in (3-59), becomes a
sufficient statistic for deciding between hypothesis $H_0$, "No signal is present," and
the composite hypothesis $H_1$, "A signal $s(t; \psi)$ is present with a phase $\psi$ just as
likely to have one value as another in $(0, 2\pi)$." The receiver chooses hypothesis $H_1$
if $R > R_0$ and $H_0$ if $R < R_0$, for some decision level $R_0$. Under the Bayes criterion,

$$e^{-d^2/2}\, I_0(R_0) = \Lambda_0,$$

with $\Lambda_0$ given by (1-17). Under the Neyman-Pearson criterion the value of $R_0$ is
chosen so that the false-alarm probability

$$Q_0 = \Pr(R > R_0 \mid H_0) = \int_{R_0}^{\infty} P_0(R)\, dR \tag{3-62}$$

equals a preassigned value, $P_0(R)$ being the probability density function of the statistic
$R$ under hypothesis $H_0$. The probability $Q_d(\psi)$ of detecting a signal $s(t; \psi)$ with
a particular phase $\psi$ is

$$Q_d(\psi) = \int_{R_0}^{\infty} P_1(R; \psi)\, dR, \tag{3-63}$$

where $P_1(R; \psi)$ is the probability density function of $R$ when the signal $s(t; \psi)$ of
(3-56) is present in the input $v(t)$. In the next section we shall calculate these
probabilities.

The statistic $R$ can be determined by passing the input $v(t)$ through the narrowband
matched filter of (3-55) and rectifying its output $v_0(t) = \operatorname{Re} V_0(t) \exp i\Omega t$.
A linear rectifier produces $|V_0(t)|$, and the value of this at time $t = T$ is the statistic
$R = |V_0(T)|$ required.

If the noise $n(t)$ is the result of passing white noise through an input filter
whose passband is broad enough not to distort the signal, we can assume that
$n(t)$ is "narrowband white noise" of the type discussed in Sec. 3.2.4. Its complex
autocovariance function is

$$\tilde{\phi}(t, u) = N\,\delta(t - u)$$

as in (3-45), and when this is taken as the kernel of the integral equation (3-52), its
solution is $Q(t) = S(t)/N$, whereupon the test statistic $R$ becomes

$$R = \frac{1}{N}\left|\int_0^T S^*(t)\, V(t)\, dt\right|. \tag{3-64}$$

It is proportional to the rectified output, at the end of the observation interval,
of a filter matched to any of the expected signals $s(t; \psi)$. The average likelihood
functional for detecting a signal of unknown phase in white noise is that in (3-60)
with $R$ now given by (3-64) and with the signal-to-noise ratio now

$$d^2 = \frac{1}{N}\int_0^T |S(t)|^2\, dt = \frac{2E}{N}. \tag{3-65}$$
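
The contrast between the coherent statistic G and the rectified statistic R can be seen in a small numerical experiment. The sketch below is illustrative only: the envelope (constant amplitude with a linear phase ramp), the sample count, and N are hypothetical, and the input is taken to be noise-free so that the outputs are deterministic.

```python
import cmath
import math

def correlate(S, V, N, dt):
    """z = integral of Q*(t) V(t) dt with Q = S/N (white noise), as a Riemann sum."""
    return sum((s / N).conjugate() * v for s, v in zip(S, V)) * dt

def phase_sweep(S, N, dt, phases):
    """Return (psi, G, R) for noise-free inputs S(t) exp(i psi)."""
    rows = []
    for psi in phases:
        V = [s * cmath.exp(1j * psi) for s in S]
        z = correlate(S, V, N, dt)
        rows.append((psi, z.real, abs(z)))   # G = Re z, R = |z|
    return rows

# Hypothetical envelope: constant amplitude 2 with a slow linear phase ramp.
T, n, N = 1.0, 500, 0.5
dt = T / n
S = [2.0 * cmath.exp(3j * k * dt) for k in range(n)]
d2 = correlate(S, S, N, dt).real
rows = phase_sweep(S, N, dt, [0.0, math.pi / 3, math.pi / 2, math.pi])
for psi, G, R in rows:
    print(f"psi={psi:.3f}  G={G:+.4f}  R={R:.4f}")
```

In this noise-free setting G follows d² cos ψ and changes sign once the phase error exceeds 90°, whereas R stays at d² for every ψ.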

For narrowband signals the quantity $d^2$ in (3-53) or (3-65) is the same as that
defined in (2-67) or (2-75). To show this, put

$$s(t) = \operatorname{Re} S(t)\, e^{i\Omega t} \quad\text{and}\quad q(t) = 2\operatorname{Re} Q(t)\, e^{i\Omega t}$$

into (2-67), expressing them in terms of their real and imaginary parts. The integrand
will be found to have a group of terms with the factors $\cos 2\Omega t$ or $\sin 2\Omega t$; these
oscillate much more rapidly than the other terms. With $\Omega T \gg 1$ in the quasiharmonic
approximation, that group of terms contributes negligibly after integration, and we
obtain the formula in (3-53) for $d^2$. When the noise is white, that reduces to (3-65).
We are justified, therefore, in calling the quantity $d^2$ in (3-53) the signal-to-noise
ratio for the detection of narrowband signals. When the noise is white, it becomes
the familiar $2E/N$, where $E$ is the energy of the signal and $N$ the unilateral spectral
density of the noise.

3.4 THE DETECTABILITY OF SIGNALS OF UNKNOWN PHASE

The performance of the receiver derived in Sec. 3.3 can be evaluated on the
basis of its probabilities $Q_0$ of a false alarm and $Q_d$ of detection. In that receiver
hypothesis $H_0$ is chosen when the statistic $R$ of (3-59) is less than the decision level
$R_0$; $H_1$ is chosen when it is greater. The probabilities in question are given by (3-62)
and (3-63), where $P_0(R)$ is the probability density function of the statistic $R$ when
the input consists of noise alone; $P_1(R; \psi)$ is its probability density function when
the input consists of noise plus the signal $s(t; \psi)$ of phase $\psi$. In (3-63) we allow for
the possibility that the probability of detecting the signal may depend on its phase.
To determine these probabilities we write the statistic $R$ as

$$R = |z| = |x + iy| = (x^2 + y^2)^{1/2}, \tag{3-66}$$

where $x$ and $y$ are the real and imaginary parts of the complex random variable

$$z = x + iy = R\, e^{i\theta} = \int_0^T Q^*(t)\, V(t)\, dt. \tag{3-67}$$

These components $x$ and $y$ are Gaussian random variables, for they are the results
of linear operations on the Gaussian random processes $X(t)$ and $Y(t)$, which are the
quadrature components of the input, $v(t) = \operatorname{Re} V(t) \exp i\Omega t$, $V(t) = X(t) + iY(t)$.
Under hypothesis $H_0$ their expected values are zero, for $E[V(t) \mid H_0] = 0$. Their
covariances can be calculated in the following way. First averaging the absolute
square of $(x + iy)$ and dividing by 2, we obtain

$$\begin{aligned}
\tfrac{1}{2} E(|x + iy|^2 \mid H_0) &= \tfrac{1}{2} E(x^2 \mid H_0) + \tfrac{1}{2} E(y^2 \mid H_0) = \tfrac{1}{2} E(zz^* \mid H_0) \\
&= \tfrac{1}{2} \int_0^T\!\!\int_0^T Q^*(t)\, Q(u)\, E[V(t) V^*(u) \mid H_0]\, dt\, du \\
&= \int_0^T\!\!\int_0^T Q^*(t)\, Q(u)\, \tilde{\phi}(t, u)\, dt\, du = \int_0^T Q^*(t)\, S(t)\, dt = d^2;
\end{aligned} \tag{3-68}$$

here we have used (3-32), the integral equation (3-52), and the definition (3-53).
Similarly, by (3-31),

$$\begin{aligned}
E[(x + iy)^2 \mid H_0] &= E[x^2 - y^2 + 2ixy \mid H_0] = E(z^2 \mid H_0) \\
&= \int_0^T\!\!\int_0^T Q^*(t)\, Q^*(u)\, E[V(t) V(u) \mid H_0]\, dt\, du = 0,
\end{aligned}$$

which implies that the expected values of $x^2$ and $y^2$ are equal and that $E(xy \mid H_0) = 0$.
From (3-68) it then follows that

$$\operatorname{Var} x = \operatorname{Var} y = d^2, \tag{3-69}$$

and $\operatorname{Cov}(x, y) = 0$. The components $x$ and $y$, as Gaussian random variables, are
therefore statistically independent, and their variances are equal to $d^2$. Under
hypothesis $H_0$ their joint probability density function is thus



$$p_0(x, y) = \frac{1}{2\pi d^2} \exp\left(-\frac{x^2 + y^2}{2d^2}\right).$$

The probability $Q_0$ of a false alarm equals the probability that the point with
coordinates $(x, y)$ lies outside a circle of radius $R_0$, and it can be calculated by
integrating the joint density function $p_0(x, y)$ over the exterior of that circle:

$$\begin{aligned}
Q_0 = \Pr[x^2 + y^2 > R_0^2 \mid H_0] &= \frac{1}{2\pi d^2} \iint_{R > R_0} \exp\left(-\frac{x^2 + y^2}{2d^2}\right) dx\, dy \\
&= \frac{1}{d^2} \int_{R_0}^{\infty} R\, e^{-R^2/2d^2}\, dR = \exp\left(-\frac{R_0^2}{2d^2}\right).
\end{aligned} \tag{3-70}$$

The integral was evaluated by changing to polar coordinates. Incidentally we have
shown that the probability density function of the statistic $R$ under hypothesis $H_0$ is

$$P_0(R) = \frac{R}{d^2} \exp\left(-\frac{R^2}{2d^2}\right) U(R), \tag{3-71}$$

which is known as the Rayleigh distribution.
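
Equation (3-70) can be inverted in closed form to set the decision level for a preassigned false-alarm probability. The sketch below does that and adds a small Monte Carlo check based on (3-69); the function names, trial count, and seed are illustrative assumptions, not part of the text.

```python
import math
import random

def false_alarm_probability(R0, d):
    """Q0 = exp(-R0^2 / (2 d^2)), eq. (3-70)."""
    return math.exp(-R0 * R0 / (2.0 * d * d))

def threshold_for(Q0, d):
    """Decision level R0 that yields a preassigned false-alarm probability Q0."""
    return d * math.sqrt(-2.0 * math.log(Q0))

def simulate_false_alarms(R0, d, trials=100_000, seed=1):
    """Draw x, y independently with variance d^2 (3-69) and count R > R0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.gauss(0.0, d)
        y = rng.gauss(0.0, d)
        if math.hypot(x, y) > R0:
            hits += 1
    return hits / trials

R0 = threshold_for(0.05, 2.0)
print(R0, false_alarm_probability(R0, 2.0), simulate_false_alarms(R0, 2.0))
```

The simulated frequency should agree with the preassigned value to within ordinary sampling error.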

When under hypothesis $H_1$ there is a signal $s(t; \psi)$ having a particular phase
$\psi$ present, the components $x$ and $y$ in (3-67) are again Gaussian random variables
with covariances given by (3-69). The expected value of the complex envelope $V(t)$
of the input is now $S(t) \exp i\psi$, and the expected values of the components $x$ and
$y$ are given by

$$E(x + iy \mid H_1, \psi) = e^{i\psi} \int_0^T Q^*(t)\, S(t)\, dt = d^2 e^{i\psi},$$

$$E(x \mid H_1, \psi) = d^2 \cos\psi, \qquad E(y \mid H_1, \psi) = d^2 \sin\psi.$$

The joint probability density function of $x$ and $y$ under hypothesis $H_1$ is therefore

$$\begin{aligned}
p_1(x, y; \psi) &= \frac{1}{2\pi d^2} \exp\left[-\frac{(x - d^2\cos\psi)^2 + (y - d^2\sin\psi)^2}{2d^2}\right] \\
&= \frac{1}{2\pi d^2} \exp\left[-\frac{x^2 + y^2 - 2d^2(x\cos\psi + y\sin\psi) + d^4}{2d^2}\right].
\end{aligned}$$

The probability that the signal $s(t; \psi)$ is detected is the probability that the
point with coordinates $(x, y)$ lies outside the circle of radius $R_0$ under hypothesis
$H_1$:

$$Q_d(\psi) = \Pr[x^2 + y^2 > R_0^2 \mid H_1] = \iint_{R > R_0} p_1(x, y; \psi)\, dx\, dy.$$

To evaluate this integral, we introduce polar coordinates $x = R\cos\theta$, $y = R\sin\theta$.
The element of area is $dx\, dy = R\, dR\, d\theta$. Using the definition (3-61) of the modified
Bessel function $I_0(x)$, we find



$$\begin{aligned}
Q_d(\psi) &= \frac{1}{2\pi d^2} \int_{R_0}^{\infty}\!\!\int_0^{2\pi} R \exp\left[-\frac{R^2 - 2d^2 R\cos(\theta - \psi) + d^4}{2d^2}\right] d\theta\, dR \\
&= \frac{1}{d^2} \int_{R_0}^{\infty} R \exp\left[-\frac{R^2 + d^4}{2d^2}\right] I_0(R)\, dR.
\end{aligned} \tag{3-72}$$

Figure 3-4. Probability density function $q(a, x)$ of the output of the detector.
Curves are indexed with the value of $a$.

The probability density function of the statistic $R$ under hypothesis $H_1$ that
one of the signals $s(t; \psi)$ is present is therefore

$$P_1(R; \psi) = \frac{1}{d}\, q\!\left(d, \frac{R}{d}\right), \qquad q(a, x) = x\, e^{-(x^2 + a^2)/2}\, I_0(ax)\, U(x). \tag{3-73}$$

This density function $q(a, x)$ has been plotted in Fig. 3-4 for a few values of the
parameter $a$. The curve for $a = 0$ represents the Rayleigh distribution; for $a > 0$
(3-73) is called the Rayleigh-Rice or the Rice-Nakagami distribution. For large values
of $a$ the density function looks much like that for a Gaussian distribution. Indeed,
with the asymptotic formula

$$I_0(x) \approx (2\pi x)^{-1/2}\, e^{x}, \qquad x \gg 1, \tag{3-74}$$

for the modified Bessel function [Abr70, p. 377, eq. 9.7.1], the density function
$q(a, x)$ becomes

$$q(a, x) \approx \left[\frac{x}{2\pi a}\right]^{1/2} \exp\left[-\tfrac{1}{2}(x - a)^2\right], \qquad ax \gg 1.$$

Averaging the conditional probability density function $P_1(R; \psi)$ with respect to
any distribution of the phase $\psi$ yields the density function of the statistic $R$,

$$P_1(R) = \frac{R}{d^2} \exp\left[-\frac{R^2 + d^4}{2d^2}\right] I_0(R)\, U(R).$$


Dividing by the probability density function of $R$ under hypothesis $H_0$, the Rayleigh
distribution in (3-71), we obtain the likelihood ratio for the statistic $R$,

$$\Lambda(R) = e^{-d^2/2}\, I_0(R),$$

which agrees with the average likelihood functional $\Lambda[V(t)]$ in (3-60), as indeed it
must, for $R$ is a sufficient statistic.

The integral in (3-72) cannot be evaluated in closed form. We put

$$Q_d(\psi) = Q(d, R_0/d), \tag{3-75}$$

expressing it in terms of Marcum's Q function,

$$Q(\alpha, \beta) = \int_{\beta}^{\infty} x\, e^{-(x^2 + \alpha^2)/2}\, I_0(\alpha x)\, dx. \tag{3-76}$$

This function has been extensively tabulated by Marcum [Mar50], and its properties
have been studied by Rice [Ric44] and Marcum [Mar48]. In particular we note the
initial values

$$Q(\alpha, 0) = 1, \qquad Q(0, \beta) = e^{-\beta^2/2}. \tag{3-77}$$

Various properties of the Q function and algorithms for computing it are to be found
in Appendix C.
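
For readers who want numbers without the tables, the definition (3-76) can be integrated directly. The following is a brute-force sketch and not one of the algorithms of Appendix C: it applies Simpson's rule to (3-76), evaluates the exponentially scaled Bessel function from the phase average in (3-61), and truncates the upper limit ten units beyond max(α, β). The step counts are arbitrary and suit only moderate arguments.

```python
import math

_COS_M1 = [math.cos(2.0 * math.pi * j / 128) - 1.0 for j in range(128)]

def i0e(z):
    """exp(-z) I_0(z) for moderate z >= 0, by the trapezoidal rule on (3-61)."""
    return sum(math.exp(z * c) for c in _COS_M1) / 128.0

def marcum_q(alpha, beta, steps=2000):
    """Q(alpha, beta) of (3-76) by Simpson's rule; the far tail is dropped."""
    upper = max(alpha, beta) + 10.0
    h = (upper - beta) / steps

    def integrand(x):
        # x exp(-(x^2 + alpha^2)/2) I_0(alpha x), rearranged to avoid overflow
        return x * math.exp(-0.5 * (x - alpha) ** 2) * i0e(alpha * x)

    total = integrand(beta) + integrand(upper)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(beta + k * h)
    return total * h / 3.0

print(marcum_q(2.0, 0.0), marcum_q(0.0, 1.5), math.exp(-1.125))
```

The two printed checks correspond to the initial values in (3-77).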

When the Neyman-Pearson criterion is being used, the decision level $R_0$ on
the statistic $R$ is picked so that the false-alarm probability $Q_0$ in (3-70) equals the
preassigned value. The probability $Q_d(\psi)$ of detecting the signal $s(t; \psi)$ is then given
by (3-75); it is independent of the phase $\psi$ of the signal that happens to be present.
This detection probability $Q_d$ is a function of the parameter $d$ in (3-53) or, when the
noise is white, in (3-65); as before, we call $d^2$ the signal-to-noise ratio.

The signal-to-noise ratio $d^2$ required to attain a given probability $Q_d$ of detection
for a fixed false-alarm probability $Q_0$ is slightly larger when the signal phase $\psi$
is unknown than when it is known. If we denote the former by $d_1^2$, the latter by $d_0^2$,
the ratio $d_1^2/d_0^2$ measures how much larger the signal-to-noise ratio must be when
the phase $\psi$ is unknown than when it is known in order for the receiver to achieve
the pair $(Q_0, Q_d)$; we call this ratio the loss entailed by not knowing the phase $\psi$ of
the signal a priori. In Fig. 3-5 we plot this loss in decibels, that is, $10 \log_{10}(d_1^2/d_0^2)$,
versus $\log_{10} Q_0$ for various values of the probability $Q_d$ of detection. Over most of
the range the loss does not exceed 1 dB.
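
The loss can be estimated numerically for a single pair (Q₀, Q_d). The sketch below rests on two assumptions that should be checked against Chapter 2: that the known-phase receiver thresholds G = x, which by (3-69) is Gaussian with variance d² and has mean d² when the signal is present, so that Q₀ = 1 − Φ(u) and Q_d = 1 − Φ(u − d₀); and that a crude Simpson-rule Marcum Q function is accurate enough. All function names are hypothetical.

```python
import math
from statistics import NormalDist

_COS_M1 = [math.cos(2.0 * math.pi * j / 64) - 1.0 for j in range(64)]

def i0e(z):
    """exp(-z) I_0(z) for moderate z >= 0, by the trapezoidal rule on (3-61)."""
    return sum(math.exp(z * c) for c in _COS_M1) / 64.0

def marcum_q(alpha, beta, steps=400):
    """Q(alpha, beta) of (3-76) by Simpson's rule; the far tail is dropped."""
    upper = max(alpha, beta) + 10.0
    h = (upper - beta) / steps

    def f(x):
        return x * math.exp(-0.5 * (x - alpha) ** 2) * i0e(alpha * x)

    total = f(beta) + f(upper)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(beta + k * h)
    return total * h / 3.0

def d_known_phase(Q0, Qd):
    """Assumed known-phase relation: Q0 = 1 - Phi(u), Qd = 1 - Phi(u - d)."""
    nd = NormalDist()
    return nd.inv_cdf(1.0 - Q0) - nd.inv_cdf(1.0 - Qd)

def d_unknown_phase(Q0, Qd):
    """Solve Q(d, R0/d) = Qd by bisection, with R0/d fixed by (3-70)."""
    beta = math.sqrt(-2.0 * math.log(Q0))
    lo, hi = 0.0, beta + 10.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if marcum_q(mid, beta) < Qd:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def phase_loss_db(Q0, Qd):
    """10 log10(d1^2 / d0^2), written as 20 log10(d1 / d0)."""
    return 20.0 * math.log10(d_unknown_phase(Q0, Qd) / d_known_phase(Q0, Qd))

loss = phase_loss_db(1e-6, 0.9)
print(f"loss = {loss:.2f} dB")
```

The bisection relies on Q(d, β) increasing with d for fixed β, so that a single root lies between the bracketing values.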



3.5 NARROWBAND SIGNALS IN COMMUNICATIONS 
3.5.1 The Binary Incoherent Channel 

Digital communications most often utilize binary signals transmitted at a constant
rate, and both the theory of coding binary data and that of detecting binary signals
are among the most extensively developed parts of communication theory. Let us
suppose that messages to be dispatched have been coded into a stream of 0's and
1's. For each 0 the transmitter sends a signal received as $s_0(t)$ and for each 1 a signal
received as $s_1(t)$. The relative frequencies of 0's and 1's are $\zeta_0$ and $\zeta_1$, respectively,
and $\zeta_0 + \zeta_1 = 1$. One speaks then of communicating over a binary channel.

Figure 3-5. Signal-to-noise ratio loss in decibels entailed by ignorance of the
phase of the signal to be detected, versus $\log_{10} Q_0$. The curves are indexed with
the probability $Q_d$ of detection.

In treating the detection of these signals we suppose that the elements of the
sequence of 0's and 1's that they represent are statistically independent. Each received
signal is confined to an interval of $T$ seconds' duration, and there is no intersymbol
interference. With both the noise and the sequence of symbols taken as stationary
random processes, all intervals are statistically alike, and we need to consider only a
single interval $(0, T)$. On the basis of its input $v(t)$ during that interval, the receiver is
to choose between two hypotheses, $H_0$, "Signal $s_0(t)$ was received," and $H_1$, "Signal
$s_1(t)$ was received." When it selects $H_0$, the receiver issues a 0; when it selects $H_1$,
a 1. The receiver is to be designed so that the relative frequency of errors in the
stream of digits it puts out (that is, the probability of error in each decision) is as
small as possible.

When the transmitted signals are narrowband pulse modulations of a carrier
of frequency $\Omega$, the received signals $s_0(t)$ and $s_1(t)$ under the two hypotheses can be
taken to have the forms

$$s_j(t) = \operatorname{Re} S_j(t) \exp(i\Omega t + i\psi_j), \qquad j = 0, 1. \tag{3-78}$$

The phases $\psi_0$ and $\psi_1$ may be unknown at the receiver for several reasons. The
transmitter may be pulsing a high-frequency oscillator with no attempt to keep the
phases of the output coherent from one signal to the next. Transmission over a
number of paths of different and variable lengths or rapidly varying delays in the
propagation of the signals from transmitter to receiver may change the phases of the
received signals in ways the receiver cannot follow. Synchronization with the phase
of the transmitted carrier may simply be too costly, and the designer may choose
to disregard phase relations between successively received signals. We speak then of
incoherent detection.

To the phases $\psi_0$ and $\psi_1$ of the received signal we assign the uniform prior
probability density function

$$z(\psi_j) = \frac{1}{2\pi}, \qquad 0 < \psi_j < 2\pi, \qquad j = 0, 1.$$

The noise is taken as stationary, narrowband, and Gaussian with autocovariance
function

$$\phi(\tau) = \operatorname{Re} \tilde{\phi}(\tau)\, e^{i\Omega\tau}.$$

In order to minimize the probability of error, the receiver selects the hypothesis with
the greater posterior probability, given its observation of the complex envelope $V(t)$
of its input $v(t) = \operatorname{Re} V(t) \exp i\Omega t$. If as in Sec. 3.3 the receiver has taken a set
$\mathbf{V} = (V_1, V_2, \ldots, V_n)$ of complex samples of $V(t)$ by means, for instance, of a set
of functions $f_k(t)$ orthonormal over $(0, T)$ as in (2-39), then according to (1-4) it
chooses hypothesis $H_1$ if $\zeta_1 p_1(\mathbf{V}) > \zeta_0 p_0(\mathbf{V})$; otherwise it chooses $H_0$. Here $p_j(\mathbf{V})$ is
the joint probability density function of the real and imaginary parts of the samples
under hypothesis $H_j$. As in (3-58),

$$p_j(\mathbf{V}) = \int_0^{2\pi} p_j(\mathbf{V}; \psi)\, \frac{d\psi}{2\pi}, \qquad j = 0, 1, \tag{3-79}$$

where $p_j(\mathbf{V}; \psi)$ is the joint probability density function of those real and imaginary
parts when the signal $s_j(t)$ is on hand and the phase of its carrier is $\psi$. If we introduce
the dummy hypothesis $H_n$ that no signal at all is present, the receiver can just as
well compare the quantities

$$\zeta_0\, \frac{p_0(\mathbf{V})}{p_n(\mathbf{V})} \quad\text{and}\quad \zeta_1\, \frac{p_1(\mathbf{V})}{p_n(\mathbf{V})},$$

where $p_n(\mathbf{V})$ is the joint probability density function of the data $\mathbf{V}$ when noise alone
is present; it is given by the right side of (3-47). Passing to the limit of an infinite
number of samples as we did in Sec. 3.3, we see that the receiver can compare the
quantities

$$\zeta_0 \Lambda_0[V(t)] \quad\text{and}\quad \zeta_1 \Lambda_1[V(t)],$$

where by (3-59) and (3-60)

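$$\Lambda_j[V(t)] = \exp\left(-\tfrac{1}{2} d_j^2\right) I_0(R_j), \qquad j = 0, 1,$$

$$R_j = \left|\int_0^T Q_j^*(t)\, V(t)\, dt\right|, \qquad d_j^2 = \int_0^T S_j^*(t)\, Q_j(t)\, dt, \tag{3-80}$$

with $Q_j(t)$ the solution of the integral equation

$$S_j(t) = \int_0^T \tilde{\phi}(t - u)\, Q_j(u)\, du, \qquad 0 < t < T, \qquad j = 0, 1, \tag{3-81}$$

as in (3-52). Thus the receiver compares the two quantities

$$y_0 = \zeta_0 \exp\left(-\tfrac{1}{2} d_0^2\right) I_0(R_0) \quad\text{and}\quad y_1 = \zeta_1 \exp\left(-\tfrac{1}{2} d_1^2\right) I_0(R_1).$$

The receiver determines the statistic $R_j$, $j = 0, 1$, at the end of each observation
interval $(0, T)$ by sampling the rectified output of a filter matched to the signal
$\operatorname{Re} Q_j(t) \exp i\Omega t$.

If $y_0 > y_1$, the receiver chooses hypothesis $H_0$ and issues a 0; if $y_0 < y_1$, it
issues a 1. Setting $y_0 = y_1$, one can compute a monotone function $f(R_0)$ such that
the receiver selects hypothesis $H_1$ when $R_1 > f(R_0)$; otherwise it selects $H_0$. The
false-alarm probability is then

$$Q_0 = \int_0^{\infty} dR_0 \int_{f(R_0)}^{\infty} p_0(R_0, R_1)\, dR_1,$$

where $p_0(R_0, R_1)$ is the joint probability density function of $R_0$ and $R_1$ under
hypothesis $H_0$. If the signals are orthogonal in the sense that

$$\int_0^T S_0^*(t)\, Q_1(t)\, dt = \int_0^T S_1^*(t)\, Q_0(t)\, dt = 0,$$

$R_0$ and $R_1$ are independent random variables, and

$$p_0(R_0, R_1) = \frac{1}{d_0}\, q\!\left(d_0, \frac{R_0}{d_0}\right) \cdot \frac{R_1}{d_1^2} \exp\left(-\frac{R_1^2}{2d_1^2}\right)$$

by (3-71) and (3-73). Then

$$\Pr[R_1 > f(R_0) \mid R_0, H_0] = \exp\left\{-\frac{[f(R_0)]^2}{2d_1^2}\right\},$$

and the false-alarm probability is

$$Q_0 = \int_0^{\infty} \frac{1}{d_0}\, q\!\left(d_0, \frac{R_0}{d_0}\right) \exp\left\{-\frac{[f(R_0)]^2}{2d_1^2}\right\} dR_0.$$

The false-dismissal probability $Q_1$ can be expressed by a similar integral; both must
be evaluated numerically. The overall probability of error is then

$$P_e = \zeta_0 Q_0 + \zeta_1 Q_1.$$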
3.5.2 The Balanced Binary Incoherent Channel

If the 0's and 1's occur equally often, $\zeta_0 = \zeta_1 = \tfrac{1}{2}$, and if the received
signal-to-noise ratios are equal, $d_0 = d_1 = d$, the binary channel is appropriately termed
balanced. When the noise is white, both signals are being received with equal energies,
$E_0 = E_1 = E$. There is complete symmetry between them. The receiver can then
simply compare $R_0$ with $R_1$, deciding that symbol 0 was sent if $R_0 > R_1$ and that 1
was sent if $R_0 < R_1$.

The probabilities of errors of the first and second kinds are equal in the balanced
channel. A rather long calculation, presented in Appendix D, shows that the
probability of error is

$$\begin{aligned}
P_e = \Pr(R_0 > R_1 \mid H_1) &= Q(v_1 d, v_2 d) - \tfrac{1}{2}\, e^{-d^2/4}\, I_0\!\left(\tfrac{1}{4}|\lambda| d^2\right) \\
&= \tfrac{1}{2}\left[1 - Q(v_2 d, v_1 d) + Q(v_1 d, v_2 d)\right],
\end{aligned} \tag{3-82}$$

with

$$v_1 = \tfrac{1}{2}\left[1 - (1 - |\lambda|^2)^{1/2}\right]^{1/2}, \qquad v_2 = \tfrac{1}{2}\left[1 + (1 - |\lambda|^2)^{1/2}\right]^{1/2},$$

where $\lambda$ is the correlation coefficient of the two signals,

$$\lambda = \frac{1}{d^2} \int_0^T\!\!\int_0^T Q_0^*(t)\, \tilde{\phi}(t - u)\, Q_1(u)\, dt\, du = \frac{1}{d^2} \int_0^T Q_0^*(t)\, S_1(t)\, dt,$$

and $Q(\cdot\,, \cdot)$ is the Q function defined in (3-76). If $\lambda = 0$, the signals are orthogonal
with respect to the interval $(0, T)$ and the kernel $\tilde{\phi}(t - u)$, whereupon the probability
of error becomes simply

$$P_e = Q(0, 2^{-1/2} d) - \tfrac{1}{2}\, e^{-d^2/4} = \tfrac{1}{2}\, e^{-d^2/4}.$$
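
The orthogonal-signal result can be checked by simulation. In the sketch below the two rectifier outputs are normalized by d, so that under H₁ the channel matched to the transmitted signal has mean d and unit variance per quadrature component, while the other channel contains noise alone; this normalization follows from (3-69) and the orthogonality condition. The trial count and seed are arbitrary.

```python
import math
import random

def pe_balanced_orthogonal(d2):
    """P_e = (1/2) exp(-d^2 / 4) for orthogonal signals in the balanced channel."""
    return 0.5 * math.exp(-d2 / 4.0)

def simulate_pe(d2, trials=100_000, seed=7):
    """Monte Carlo estimate of Pr(R0 > R1 | H1) using the normalized statistics R_j / d."""
    rng = random.Random(seed)
    d = math.sqrt(d2)
    errors = 0
    for _ in range(trials):
        r1 = math.hypot(d + rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))  # signal channel
        r0 = math.hypot(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))      # noise-only channel
        if r0 > r1:
            errors += 1
    return errors / trials

print(pe_balanced_orthogonal(8.0), simulate_pe(8.0))
```

The carrier phase does not enter the simulation, because the distribution of the rectified output does not depend on it.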
3.5.3 The Unilateral Binary Incoherent Channel

In an on-off binary communication system, $s_0(t) = 0$, and the receiver uses the same
decision scheme as developed in Sec. 3.3.2, comparing the rectified output $R$ of a
filter matched to the signal $\operatorname{Re}[Q_1(t) \exp i\Omega t]$ with a decision level that we here call
$r_0$, and it issues a 1 when $R > r_0$ and a 0 when $R < r_0$. From (3-60) we see that the
decision level $r_0$ is given by the equation

$$\exp\left(-\tfrac{1}{2} d_1^2\right) I_0(r_0) = \frac{\zeta_0}{\zeta_1} \tag{3-83}$$

in terms of the relative frequencies $\zeta_0$ and $\zeta_1$ of 0's and 1's; $d_1^2$ is as in (3-80) the
signal-to-noise ratio of the received signal when a 1 is transmitted.

The average probability of error in this unilateral binary incoherent channel is

$$P_e = \zeta_0 \Pr(R > r_0 \mid H_0) + \zeta_1 \Pr(R < r_0 \mid H_1),$$

and by (3-70) and (3-76) we can write this as

$$P_e = \zeta_0\, e^{-b^2/2} + \zeta_1\left[1 - Q(d_1, b)\right], \qquad b = \frac{r_0}{d_1},$$

with $Q(\cdot\,, \cdot)$ again Marcum's Q function. By using (C-11) and (3-83) this expression
can be simplified to

$$P_e = \zeta_1\, Q(b, d_1). \tag{3-84}$$
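
The decision level and both forms of the error probability can be evaluated numerically as a consistency check. The sketch below solves (3-83) by bisection on log I₀ and compares the two-term expression for P_e with the simplified form (3-84). It is a hedged illustration: the Marcum Q routine is a plain Simpson-rule integration of (3-76), the function names are hypothetical, and the accuracy is adequate only for moderate signal-to-noise ratios.

```python
import math

_COS_M1 = [math.cos(2.0 * math.pi * j / 64) - 1.0 for j in range(64)]

def i0e(z):
    """exp(-z) I_0(z) for moderate z >= 0, by the trapezoidal rule on (3-61)."""
    return sum(math.exp(z * c) for c in _COS_M1) / 64.0

def marcum_q(alpha, beta, steps=1000):
    """Q(alpha, beta) of (3-76) by Simpson's rule; the far tail is dropped."""
    upper = max(alpha, beta) + 10.0
    h = (upper - beta) / steps

    def f(x):
        return x * math.exp(-0.5 * (x - alpha) ** 2) * i0e(alpha * x)

    total = f(beta) + f(upper)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(beta + k * h)
    return total * h / 3.0

def decision_level(d1, zeta0, zeta1):
    """Solve exp(-d1^2/2) I_0(r0) = zeta0/zeta1, eq. (3-83), by bisection."""
    target = 0.5 * d1 * d1 + math.log(zeta0 / zeta1)   # required value of log I_0(r0)
    if target <= 0.0:
        return 0.0            # the left side already reaches the ratio at r0 = 0
    lo, hi = 0.0, target + 10.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mid + math.log(i0e(mid)) < target:           # log I_0(mid)
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pe_unilateral(d1, zeta0=0.5, zeta1=0.5):
    """Return r0 and P_e computed both ways: two-term form and eq. (3-84)."""
    r0 = decision_level(d1, zeta0, zeta1)
    b = r0 / d1
    direct = zeta0 * math.exp(-0.5 * b * b) + zeta1 * (1.0 - marcum_q(d1, b))
    simplified = zeta1 * marcum_q(b, d1)
    return r0, direct, simplified

r0, direct, simplified = pe_unilateral(4.0)
print(r0, direct, simplified)
```

Agreement of the two printed error probabilities is a numerical check on the simplification leading to (3-84).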

When $\zeta_0 = \zeta_1 = \tfrac{1}{2}$, pulses and blanks are being sent equally often, and the
average signal-to-noise ratio, which is proportional to the average transmitted power,
is $d_{\mathrm{av}}^2 = \tfrac{1}{2} d_1^2$. In comparing the unilateral and the balanced channels, we should
equate their average signal-to-noise ratios and set $d_{\mathrm{av}}^2 = d^2$, where $d^2$ is the
signal-to-noise ratio in Sec. 3.5.2, for the average power expended by their transmitters will
then be the same. When the noise is white, $\tfrac{1}{2} d^2 = E/N$, where $E$ is the energy of
each signal received in the balanced channel and $N$ is the unilateral spectral density
of the noise. In Fig. 3-6 we have plotted the probability $P_e$ of error for each of these
channels versus $\tfrac{1}{2} d^2$. The unilateral channel has the smaller probability of error.

Figure 3-6. Error probabilities for the unilateral and balanced incoherent
binary channels versus average signal-to-noise ratio $d^2/2$. Solid line: unilateral
channel; dotted line: balanced channel.

3.5.4 The Incoherent M-ary Channel

Suppose that the transmitter is sending messages coded into an alphabet of $M$ symbols,
each invoking a different narrowband signal. The received signals are then
narrowband pulses modulating a radio-frequency carrier,

$$s_j(t) = \operatorname{Re} S_j(t) \exp(i\Omega t + i\psi_j), \qquad j = 1, 2, \ldots, M,$$

and when no attempt is made to track the phase of the received carrier, the phases
$\psi_j$ must be treated as random variables. It is most reasonable to assign them the
uniform distribution over $(0, 2\pi)$. The receiver is to decide which of these $M$ signals
is present during an observation interval $(0, T)$.

This problem is an extension to $M$ hypotheses of the binary decision problem
treated in Sec. 3.5.1. Again a useful dummy hypothesis $H_n$ states that noise alone is
present, and by carrying through the same analysis as in that part, we find that the
optimum receiver decides that that hypothesis $H_j$ is true for which

$$\zeta_j \Lambda_j[V(t)] = \zeta_j \exp\left(-\tfrac{1}{2} d_j^2\right) I_0(R_j), \qquad j = 1, 2, \ldots, M, \tag{3-85}$$

is largest; the statistic $R_j$ and the signal-to-noise ratio $d_j^2$ are defined as in (3-80).
The receiver embodies a bank of $M$ filters in parallel, each matched to one of
the signals $\operatorname{Re} Q_j(t) \exp i\Omega t$, where $Q_j(t)$ is the solution of the integral equation
(3-81). Each filter is followed by a linear rectifier, whose output, sampled at the
end of the observation interval $(0, T)$, is the statistic $R_j$. The receiver computes by
analog or digital means the $M$ likelihood functionals $\Lambda_j[V(t)]$ weighted with the prior
probabilities and determines the largest. In general it will be difficult to calculate
the error probability of such a communication system.

If the signals occur equally often, $\zeta_j = M^{-1}$, arrive at the receiver with equal
signal-to-noise ratios $d_j^2 \equiv d^2$, and are orthogonal in the sense of (2-46), that is,

$$\int_0^T S_i^*(t)\, Q_j(t)\, dt = d^2\, \delta_{ij},$$

the receiver simply decides that the $j$th signal is present when $R_j > R_i$, $\forall\, i \ne j$.
When the noise is white, the signals must be orthogonal in the usual sense of (2-39),
and $d_j^2 = 2E_j/N$, where $E_j$ is the energy of the $j$th signal and $N$ is the unilateral
spectral density of the noise.

To calculate the probability $Q_c$ of a correct decision, we assume that the first
signal is the one received, and as in (2-133) we find by (3-73)

$$\begin{aligned}
Q_c &= \Pr(R_1 > R_i,\ i = 2, 3, \ldots, M \mid H_1) \\
&= \int_0^{\infty} x\left(1 - e^{-x^2/2}\right)^{M-1} e^{-(x^2 + d^2)/2}\, I_0(xd)\, dx.
\end{aligned}$$

We expand the factor $(\,\cdot\,)^{M-1}$ by the binomial theorem and integrate term by term,
utilizing the normalization integral for the Rayleigh-Rice distribution in (3-73) (see
(C-4)), and after a little algebra we find for the probability of error

$$P_e = 1 - Q_c = \frac{1}{M} \sum_{r=2}^{M} (-1)^r \binom{M}{r}\, e^{-(r-1)d^2/2r}.$$
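
The finite sum for the error probability is easy to evaluate directly. The sketch below does so and checks two limiting cases: M = 2, where it should reduce to the orthogonal balanced-channel result of Sec. 3.5.2, and d² = 0, where the receiver can only guess. The function name is a hypothetical choice, and for large M the alternating sum may lose accuracy in floating point.

```python
import math

def pe_mary_orthogonal(M, d2):
    """P_e = (1/M) sum_{r=2}^{M} (-1)^r C(M, r) exp(-(r-1) d^2 / (2r))."""
    total = 0.0
    for r in range(2, M + 1):
        total += (-1) ** r * math.comb(M, r) * math.exp(-(r - 1) * d2 / (2.0 * r))
    return total / M

for M in (2, 4, 16):
    print(M, pe_mary_orthogonal(M, 16.0))
```

At a fixed signal-to-noise ratio the error probability grows with the alphabet size M, since there are more noise-only channels that can exceed the signal channel.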

Many other types of signal sets are used in communication systems, and their 
virtues and disadvantages are discussed in texts on communication theory, where one 
can also find calculations of the error probabilities they suffer. This large subject is 
beyond the scope of this book. 



3.6 TESTING COMPOSITE HYPOTHESES
3.6.1 The Bayes Criterion

Chapter 1 treated strategies for choosing between two hypotheses on the basis of a
number of measurements, strategies that in essence select which of two probability
density functions $p_0(\mathbf{v})$ or $p_1(\mathbf{v})$ is the more consistent with the observed values
$v_1, v_2, \ldots, v_n$ of $n$ random variables. The choice is to be optimum under some
criterion (Bayes or Neyman-Pearson) corresponding to a definition of long-run
success. It was assumed that the two probability density functions are known in all
respects. In Chapter 2 this theory was applied to the detection of a unique signal in
Gaussian noise.

Signals to be detected, however, are rarely unique; seldom is the form with
which they appear at the receiver completely known. Usually it is necessary to detect
one of a class of signals specified by parameters taking values anywhere in more or
less well-defined ranges. A narrowband radar echo, for example, has the form

$$s(t; A, t_0, \Omega) = A \operatorname{Re} S(t - t_0) \exp i\Omega(t - t_0).$$

Its amplitude $A$ depends on the size and reflectivity of the target; the same target in
different orientations may reflect the radar pulse with widely different amplitudes.
The arrival time $t_0$ depends on the distance to the target, and the carrier frequency
$\Omega$, through the Doppler effect, depends on the component of the velocity of the
target in the direction of the receiver. None of these three quantities may be known
in advance with much precision. A radar system must be designed to detect echoes
with a spectrum of values of $A$, $t_0$, and $\Omega$. In a communication system utilizing
quasiharmonic signals $A \operatorname{Re} F(t) \exp(i\Omega t + i\psi)$, it is only under the most carefully
controlled circumstances that the amplitude $A$ and the phase $\psi$ of the carrier are
known by the receiver. The carrier frequency $\Omega$ may also vary, as when an enemy,
in order to hinder our intercepting his messages, changes his carrier frequency
$\Omega$ from pulse to pulse in a way known to his receiver, but not to ours. In this
chapter we have analyzed the detection of narrowband signals of unknown carrier
phase $\psi$. Now we must consider how to adapt the general theory to accommodate
uncertainties in any parameters of the signals to be detected.

Denote the unknown parameters of the signal by $\theta_1, \theta_2, \ldots, \theta_m$, supposing
there to be $m$ all told. We can represent them by a vector $\boldsymbol\theta = (\theta_1, \theta_2, \ldots, \theta_m)$ in
an $m$-dimensional parameter space, which we designate by $\Theta$. Under hypothesis $H_1$
these parameters appear in the probability density function $p_1(v \mid \boldsymbol\theta)$ of the data $v$
through its dependence on the signal $s(t; \boldsymbol\theta)$. Hypothesis $H_1$ now asserts that one of
a class of signals $s(t; \boldsymbol\theta)$ with parameters $\boldsymbol\theta \in \Theta$ is present; it is said to be a composite
hypothesis. Hypothesis $H_0$, we shall presume for the most part, states that only noise
is present; and if the statistical characteristics of the noise are completely known,
hypothesis $H_0$ remains what is termed a simple hypothesis. Under $H_0$ the probability
density function of the data is $p_0(v)$.

The task of the receiver is to decide whether the values of the data $v$ actually
measured were drawn from a population described by the probability density function
$p_0(v)$ (hypothesis $H_0$) or from one described by a probability density function
$p_1(v \mid \boldsymbol\theta)$ for a set of parameters $\boldsymbol\theta \in \Theta$ (hypothesis $H_1$). Again the decision strategy
can be described as a division of the $n$-dimensional Cartesian space $R_n$ of the
observations $v$ into two regions $R_0$ and $R_1$. Hypothesis $H_0$ is chosen when the point
whose coordinates are the observed values $v = (v_1, v_2, \ldots, v_n)$ lies in region $R_0$, and
$H_1$ when it lies in $R_1$. The decision surface $D$ dividing these regions is to be selected
so that the statistical test is optimum in some sense. Ideally, but rarely, all prior
probabilities and costs are well defined and the Bayes criterion is applicable, a situation
treated by Wald [Wal50], whose statistical decision theory was applied to signal
detection by Middleton and Van Meter [Mid55]. The observer knows not only the
prior probabilities $\zeta_0$ and $\zeta_1$ with which hypotheses $H_0$ and $H_1$, respectively, hold,
but also a joint prior probability density function $z(\boldsymbol\theta) = z(\theta_1, \theta_2, \ldots, \theta_m)$ of the $m$
parameters $\boldsymbol\theta$, which describes their relative frequencies of occurrence when hypothesis
$H_1$ is true. As with all joint density functions, its integral over the parameter
space equals 1:

$$\int_\Theta z(\boldsymbol\theta)\, d^m\boldsymbol\theta = 1. \tag{3-86}$$

The observer must in addition know the costs $C_{00}$ and $C_{10}$ of choosing hypotheses
$H_0$ and $H_1$, respectively, when $H_0$ is true, and the costs $C_{01}(\boldsymbol\theta)$ and $C_{11}(\boldsymbol\theta)$ of choosing
$H_0$ and $H_1$, respectively, when $H_1$ is true and a signal with parameters $\boldsymbol\theta$ is present.
These costs may depend on the parameters of that signal.

The average cost per decision is now, as an evident modification of (1-14), 



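$$\bar{C} = \zeta_0 \left[ C_{00} \int_{R_0} p_0(v)\, d^n v + C_{10} \int_{R_1} p_0(v)\, d^n v \right]
+ \zeta_1 \left[ \int_{R_0} d^n v \int_\Theta d^m\boldsymbol\theta\, z(\boldsymbol\theta)\, C_{01}(\boldsymbol\theta)\, p_1(v \mid \boldsymbol\theta)
+ \int_{R_1} d^n v \int_\Theta d^m\boldsymbol\theta\, z(\boldsymbol\theta)\, C_{11}(\boldsymbol\theta)\, p_1(v \mid \boldsymbol\theta) \right]. \tag{3-87}$$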



The first bracket of (3-87) is the risk associated with hypothesis $H_0$; the second is
that associated with $H_1$. The decision surface $D$ separating the regions $R_0$ and $R_1$
must be situated so that $\bar{C}$ is minimum. The analysis proceeds as in Sec. 1.2, where
the Bayes strategy for a choice between simple hypotheses was derived, and it shows
that the decision surface $D$ consists of those points $v$ satisfying the equation

$$\zeta_0 (C_{10} - C_{00})\, p_0(v) = \zeta_1 \int_\Theta d^m\boldsymbol\theta\, z(\boldsymbol\theta)\, [C_{01}(\boldsymbol\theta) - C_{11}(\boldsymbol\theta)]\, p_1(v \mid \boldsymbol\theta), \qquad v \in D. \tag{3-88}$$



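Those points $v$ for which the left side of (3-88) is the larger make up $R_0$; those for
which it is the smaller make up $R_1$. To choose between hypotheses $H_0$ and $H_1$ the
observer calculates the cost-likelihood ratio

$$\Lambda_c = \frac{\zeta_1 \int_\Theta d^m\boldsymbol\theta\, z(\boldsymbol\theta)\, [C_{01}(\boldsymbol\theta) - C_{11}(\boldsymbol\theta)]\, p_1(v \mid \boldsymbol\theta)}{\zeta_0 (C_{10} - C_{00})\, p_0(v)}$$

on the basis of the observations $v$ and decides for hypothesis $H_0$ if $\Lambda_c < 1$ and for
$H_1$ if $\Lambda_c > 1$.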

When, as we assume henceforth, the costs $C_{01}$ and $C_{11}$ do not depend on
the values of the parameters $\boldsymbol\theta$, we can carry out the integration over $\boldsymbol\theta$ in these
expressions, and by introducing the overall probability density function

$$p_1(v) = \int_\Theta z(\boldsymbol\theta)\, p_1(v \mid \boldsymbol\theta)\, d^m\boldsymbol\theta$$

of the data under hypothesis $H_1$ we can reduce our expressions to the same form as
in Sec. 1.2. The observer now forms the average likelihood ratio

$$\Lambda(v) = \frac{p_1(v)}{p_0(v)} = \int_\Theta d^m\boldsymbol\theta\, z(\boldsymbol\theta)\, \Lambda(v \mid \boldsymbol\theta), \qquad \Lambda(v \mid \boldsymbol\theta) = \frac{p_1(v \mid \boldsymbol\theta)}{p_0(v)}, \tag{3-89}$$

and compares it with the quantity

$$\Lambda_0 = \frac{\zeta_0 (C_{10} - C_{00})}{\zeta_1 (C_{01} - C_{11})} \tag{3-90}$$

as before; if $\Lambda(v) < \Lambda_0$, hypothesis $H_0$ is chosen, otherwise $H_1$.

The Bayes cost, which is the minimum value $\bar{C}_{\min}$ of (3-87) obtained when the
decision surface is given by (3-88), depends on the prior probabilities $\zeta_0$ and $\zeta_1$ and
on the form of the prior probability density function $z(\boldsymbol\theta)$. If these are unknown,
but the costs are defined, one might seek those prior probabilities $\zeta_0$ and $\zeta_1$ and
that prior density function $z(\boldsymbol\theta)$ for which $\bar{C}_{\min}$ is maximum, assuming perhaps that
some adversary is picking them so as to make the observer's minimum loss as large
as possible. The form of the prior density function $z(\boldsymbol\theta)$ that with the proper values
of $\zeta_0$ and $\zeta_1$ maximizes the Bayes cost $\bar{C}_{\min}$ is called the least favorable distribution
of the parameters $\boldsymbol\theta$. In a few problems it can be found by inspection; in others
it may be most difficult to calculate. The concept of a least favorable distribution
will appear again when we discuss the Neyman-Pearson criterion. For an extensive
treatment of the Bayes criterion in problems of conventional statistics, we refer to
[Bla54].
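
The average likelihood ratio (3-89) and the decision level (3-90) can be illustrated with a small numerical sketch. The model below is an assumption made only for illustration and does not come from the text: the data are $n$ independent unit-variance Gaussian samples with mean zero under $H_0$ and mean $\theta$ under $H_1$, and the prior $z(\theta)$ is replaced by a few hypothetical discrete values and weights. All function names are likewise illustrative.

```python
import math

def conditional_lr(v, theta):
    """Lambda(v | theta) = p1(v | theta) / p0(v) for unit-variance Gaussian samples."""
    n = len(v)
    return math.exp(theta * sum(v) - 0.5 * n * theta * theta)

def average_lr(v, thetas, weights):
    """Discrete stand-in for the integral in (3-89): sum of z(theta) * Lambda(v | theta)."""
    return sum(w * conditional_lr(v, th) for th, w in zip(thetas, weights))

def decide(v, thetas, weights, zeta0, zeta1, c10_minus_c00=1.0, c01_minus_c11=1.0):
    """Return 1 (choose H1) unless the average likelihood ratio is below Lambda_0 of (3-90)."""
    lambda0 = zeta0 * c10_minus_c00 / (zeta1 * c01_minus_c11)
    return 1 if average_lr(v, thetas, weights) >= lambda0 else 0

# Hypothetical data and prior, chosen only to exercise the functions.
v = [0.9, 1.4, 0.2, 1.1]
thetas, weights = [0.5, 1.0, 1.5], [0.25, 0.5, 0.25]
print(average_lr(v, thetas, weights), decide(v, thetas, weights, zeta0=0.5, zeta1=0.5))
```

With costs independent of the parameters, only the prior-weighted sum of conditional likelihood ratios is needed; a continuous prior would replace the sum by a numerical integral over $\Theta$.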

If both hypotheses $H_0$ and $H_1$ are composite, prior probability density functions
$z_0(\boldsymbol\theta)$ and $z_1(\boldsymbol\theta)$ of the parameters under both hypotheses must be specified. If the
costs are independent of the true values of the parameters, the optimum decision
under the Bayes criterion, as it is not hard to see, is made by comparing the likelihood
ratio

$$\Lambda(v) = \frac{p_1(v)}{p_0(v)}, \qquad p_j(v) = \int_\Theta z_j(\boldsymbol\theta)\, p_j(v \mid \boldsymbol\theta)\, d^m\boldsymbol\theta, \qquad j = 0, 1, \tag{3-91}$$

with the same decision level $\Lambda_0$ as in (3-90). Again unknown prior density functions
$z_j(\boldsymbol\theta)$ might be replaced by least favorable ones if these can be discovered.

Suppose that the receiver is to decide, on the basis of its input $v(t)$, which of
two signals $s_0(t; \boldsymbol\theta)$ and $s_1(t; \boldsymbol\theta)$ has been added to the random noise $n(t)$ to form
the input $v(t)$. If we divide $p_j(v)$ in (3-91) by $p_n(v)$, which is the probability density
function of the data $v$ when no signal at all is present, the likelihood ratio can be
written

$$\Lambda(v) = \frac{\Lambda_1(v)}{\Lambda_0(v)}, \qquad \Lambda_j(v) = \int_\Theta z_j(\boldsymbol\theta)\, \Lambda_j(v \mid \boldsymbol\theta)\, d^m\boldsymbol\theta, \qquad \Lambda_j(v \mid \boldsymbol\theta) = \frac{p_j(v \mid \boldsymbol\theta)}{p_n(v)}, \qquad j = 0, 1. \tag{3-92}$$

Here $\Lambda_j(v \mid \boldsymbol\theta)$ is the likelihood ratio appropriate to the decision between two simple
hypotheses, $H_j$: "Signal $s_j(t; \boldsymbol\theta)$ is present in the midst of random noise $n(t)$," and
$H_n$: "Noise $n(t)$ alone is present." When the noise is Gaussian, we can as in Chapter
2 pass to the limit $n \to \infty$ of an infinite number of samples of the input, whereupon
the likelihood ratio $\Lambda_j(v \mid \boldsymbol\theta)$ goes into a likelihood functional $\Lambda_j[v(t) \mid \boldsymbol\theta]$ similar to
that in (2-71). The decision can now be based on the quantity

$$\Lambda[v(t)] = \frac{\Lambda_1[v(t)]}{\Lambda_0[v(t)]}, \qquad \Lambda_j[v(t)] = \int_\Theta z_j(\boldsymbol\theta)\, \Lambda_j[v(t) \mid \boldsymbol\theta]\, d^m\boldsymbol\theta, \qquad j = 0, 1. \tag{3-93}$$

Similar passages to the limit $n \to \infty$ of an infinite number of data can be carried
out in the other formulations of this section. Observe that we average over the
parameters $\boldsymbol\theta$ before dividing to form the ratio $\Lambda[v(t)]$ in (3-93). An example of this
method has been seen in Sec. 3.5.1, where we averaged over the phase $\psi$ of each
signal $s_j(t; \psi)$, $j = 0, 1$.

3.6.2 The Extended Neyman-Pearson Criterion

We return to the choice between the simple hypothesis $H_0$, "No signal is present,"
and the composite hypothesis $H_1$, "One of a class of signals $s(t; \boldsymbol\theta)$ with parameters
$\boldsymbol\theta \in \Theta$ is present." Once the decision strategy has been adopted as a dichotomy of
the data space $R_n$ into regions $R_0$ and $R_1$, the probability $Q_0$ of an error of the first
kind, or false alarm, is

$$Q_0 = \int_{R_1} p_0(v)\, d^n v, \tag{3-94}$$

$R_1$ being the portion of $R_n$ in which hypothesis $H_1$ is chosen. The probability of
detecting a signal $s(t; \boldsymbol\theta)$ with a particular set $\boldsymbol\theta$ of parameters is

$$Q_d(\boldsymbol\theta) = \int_{R_1} p_1(v \mid \boldsymbol\theta)\, d^n v. \tag{3-95}$$

Having adopted the Neyman-Pearson criterion, one would like to find a decision
surface $D$ separating regions $R_0$ and $R_1$ and yielding the maximum probability $Q_d(\boldsymbol\theta)$ of
detecting a signal with any set of parameters, and incurring the preassigned
false-alarm probability $Q_0$. If the same surface $D$ is thus optimum for all values of
the parameters $\boldsymbol\theta$, the strategy so defined provides what is called a uniformly most
powerful test of hypothesis $H_1$ against hypothesis $H_0$.

Here is an example of a uniformly most powerful test. Let the $n$ observations
$v$ be normally distributed, independent random variables with variance $\sigma^2$, and let
their expected values be zero under hypothesis $H_0$ and $m > 0$ under $H_1$. Then the
joint probability density functions of the data are

$$p_0(v) = (2\pi\sigma^2)^{-n/2} \exp\left[ -\sum_{k=1}^{n} \frac{v_k^2}{2\sigma^2} \right], \qquad
p_1(v \mid m) = (2\pi\sigma^2)^{-n/2} \exp\left[ -\sum_{k=1}^{n} \frac{(v_k - m)^2}{2\sigma^2} \right].$$

Of the parameter $m$ all that is known is that it is positive. For any fixed value of $m$,
according to Example 1-2 in Sec. 1.2.4, the optimum decision strategy is equivalent
to comparing the sample mean

$$V = \frac{1}{n} \sum_{k=1}^{n} v_k$$

with a fixed critical value $M$ and picking hypothesis $H_0$ when $V < M$ and $H_1$ when
$V > M$. This decision level $M$ is completely determined by the probability $Q_0$ of an
error of the first kind,

$$Q_0 = \int_M^\infty P_0(V)\, dV,$$

where $P_0(V)$ is a Gaussian density function with expected value zero and variance
$\sigma^2/n$. The decision surface $D$ is now the hyperplane

$$\sum_{k=1}^{n} v_k = nM,$$

and it is independent of the true expected value $m$ under hypothesis $H_1$. The test
is therefore uniformly most powerful when only positive values of $m$ are possible; it
can be used in ignorance of the actual value of $m$.
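
The independence of the decision level from $m$ can be checked numerically. The following sketch uses only Python's standard `statistics.NormalDist`; the function names and the numerical values of $Q_0$, $\sigma$, and $n$ are illustrative assumptions, not taken from the text.

```python
import math
from statistics import NormalDist

def critical_value(q0, sigma, n):
    """Decision level M with Pr(V > M | H0) = q0, where V ~ N(0, sigma^2 / n)."""
    return NormalDist(0.0, sigma / math.sqrt(n)).inv_cdf(1.0 - q0)

def detection_probability(m, q0, sigma, n):
    """Pr(V > M | H1) for true mean m; the level M is computed without reference to m."""
    level = critical_value(q0, sigma, n)
    return 1.0 - NormalDist(m, sigma / math.sqrt(n)).cdf(level)

# Hypothetical design values: Q0 = 0.05, sigma = 1, n = 25 samples.
M = critical_value(0.05, 1.0, 25)
for m in (0.1, 0.3, 0.6):
    print(m, M, detection_probability(m, 0.05, 1.0, 25))
```

The same level $M$ serves every positive $m$; only the resulting probability of detection changes with the true mean.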

It is the exception rather than the rule for the decision surface $D$ maximizing
$Q_d(\boldsymbol\theta)$ of (3-95) for fixed $Q_0$ to be independent of the parameters $\boldsymbol\theta$. In the vast
majority of problems the same surface $D$ will not be optimum for all values of $\boldsymbol\theta$,
and a uniformly most powerful test does not exist. This is so, for instance, when the
true expected value $m$ in our example can be either positive or negative.

If the prior probability density function $z(\boldsymbol\theta)$ of the parameters is known,
the extended Neyman-Pearson criterion requires that, for preassigned false-alarm
probability $Q_0$, the average probability of detection

$$\bar{Q}_d = \int_\Theta z(\boldsymbol\theta)\, Q_d(\boldsymbol\theta)\, d^m\boldsymbol\theta \tag{3-96}$$

be maximum. With this prior density function, the overall probability density
function of the data $v$ is

$$p_1(v) = \int_\Theta z(\boldsymbol\theta)\, p_1(v \mid \boldsymbol\theta)\, d^m\boldsymbol\theta, \tag{3-97}$$

and the average probability of detection is

$$\bar{Q}_d = \int_{R_1} p_1(v)\, d^n v. \tag{3-98}$$

The test in effect chooses either $p_0(v)$ or $p_1(v)$ as better representing the data at hand,
and as in Sec. 1.2 the optimum strategy forms the likelihood ratio

$$\Lambda(v) = \frac{p_1(v)}{p_0(v)} \tag{3-99}$$

and decides for $H_1$ if $\Lambda(v) > \Lambda_0$ and for $H_0$ if $\Lambda(v) < \Lambda_0$, with the value of $\Lambda_0$ fixed
by the preassigned false-alarm probability

$$Q_0 = \Pr[\Lambda(v) > \Lambda_0 \mid H_0] = \int_{\Lambda_0}^\infty P_0(\Lambda)\, d\Lambda.$$

The average probability of detection is then

$$\bar{Q}_d = \Pr[\Lambda(v) > \Lambda_0 \mid H_1] = \int_{\Lambda_0}^\infty P_1(\Lambda)\, d\Lambda.$$

As in Chapter 1, $P_0(\Lambda)$ and $P_1(\Lambda)$ are the probability density functions of the
statistic $\Lambda(v)$ under the two hypotheses. Just as in Sec. 3.6.1, $\Lambda(v)$ of (3-99) is an
average likelihood ratio and can be expressed as in (3-89). If, as with the detection
of a known signal in Gaussian noise, we can take the likelihood ratio $\Lambda(v \mid \boldsymbol\theta)$ to the
limit $n \to \infty$ to obtain the likelihood functional $\Lambda[v(t) \mid \boldsymbol\theta]$ for the detection of the
signal $s(t; \boldsymbol\theta)$, the decision that is optimum under this broadened Neyman-Pearson
criterion can be based on the average likelihood functional

$$\Lambda[v(t)] = \int_\Theta z(\boldsymbol\theta)\, \Lambda[v(t) \mid \boldsymbol\theta]\, d^m\boldsymbol\theta. \tag{3-100}$$

In the problem of detecting a narrowband signal of unknown phase in Gaussian
noise, treated in Sec. 3.3.2, the unknown parameter is $\theta = \psi$, and there we accepted
the prior density function $z(\psi) = (2\pi)^{-1}$ for it. The average likelihood functional
(3-100) is then given by (3-60). In Sec. 3.4 we calculated the probability $Q_d(\psi)$ of
detecting a signal with a particular phase $\psi$ and found it given by (3-75) independently
of $\psi$.

What is to be done when past experience provides no guidance to selecting the
prior probability density function $z(\boldsymbol\theta)$? The most prudent course would seem to be
to adopt that prior density function $\tilde{z}(\boldsymbol\theta)$ for which the maximum average probability
of detection

$$\bar{Q}_d[z] = \int_{R_1[z]} p_1(v)\, d^n v = \int_{R_1[z]} d^n v \int_\Theta z(\boldsymbol\theta)\, p_1(v \mid \boldsymbol\theta)\, d^m\boldsymbol\theta \tag{3-101}$$

is least:

$$\bar{Q}_d[\tilde{z}] \le \bar{Q}_d[z], \qquad \forall\, z(\boldsymbol\theta). \tag{3-102}$$

The decision region $R_1[z]$ in (3-101) contains those points $v$ for which

$$\Lambda(v) = \frac{p_1(v)}{p_0(v)} > \Lambda_0,$$

with $\Lambda_0$ such that the false-alarm probability

$$Q_0[z] = \int_{R_1[z]} p_0(v)\, d^n v \tag{3-103}$$

takes on the preassigned value. The prior probability density function $\tilde{z}(\boldsymbol\theta)$ defined
by (3-102) is termed the least favorable distribution of the parameters $\boldsymbol\theta$ with respect to
the Neyman-Pearson criterion. In Sec. 7.6 we shall develop a general criterion that
determines whether a given prior probability density function $z(\boldsymbol\theta)$ is least favorable
in this sense.

3.6.3 Detection of Signals of Unknown Amplitude 

Let us denote the amplitude of the signal by $A$ and assume it positive, $A > 0$;
the remaining unknown parameters of the signal, including possibly its sign, are
designated by $\boldsymbol\theta$ and lie in a parameter space $\Theta$. The amplitude $A$ will be taken
to be statistically independent of those other parameters $\boldsymbol\theta$, as will be the case in
most detection problems; the joint prior probability density function of the signal
parameters then has the form $z_A(A)\, z(\boldsymbol\theta)$. We consider various ways of coping with
ignorance of the amplitude $A$ of the signal to be detected. We adopt the extended
Neyman-Pearson criterion that for a certain preassigned false-alarm probability the
probability of detection, averaged with respect to the prior distribution $z(\boldsymbol\theta)$ of the
parameters other than amplitude, shall be maximum. That prior distribution $z(\boldsymbol\theta)$
may have been selected on the basis of past experience, or it may be the least favorable
distribution defined at the end of Sec. 3.6.2.

It may be that the detection strategy that is optimum in this sense for a signal
with a given amplitude $A$ turns out not to depend on the value of $A$. For example,
the strategy determined in Sec. 3.3.2 for signals with a uniformly distributed random
phase $\psi$ can be put into a form independent of $A$. Writing the signal as

$$s(t; \psi) = A \,\mathrm{Re}\, F(t)\, e^{i\Omega t + i\psi},$$

we easily see that the strategy is equivalent to comparing the sufficient statistic

$$r = \left| \int_0^T G^*(t)\, V(t)\, dt \right|$$

with a decision level $r_0$; here $V(t)$ is the complex envelope of the input and $G(t)$ is
the solution of the integral equation

$$F(t) = \int_0^T \phi(t, s)\, G(s)\, ds, \qquad 0 < t < T, \tag{3-104}$$

with $\phi(t, s)$ the complex autocovariance function of the noise. A signal $s(t; \psi)$ is
declared present if $r > r_0$, and $r_0$ is determined by the false-alarm probability, which
as in (3-70) is

$$Q_0 = \exp\left( -\frac{r_0^2}{2 d'^2} \right), \qquad d'^2 = \int_0^T G^*(t)\, F(t)\, dt.$$

The probability $Q_d(A, \psi)$ of detecting a signal with amplitude $A$ and phase $\psi$ is by
(3-75)

$$Q_d(A, \psi) = Q(d, r_0/d'), \qquad d = A d',$$

in terms of Marcum's $Q$ function (3-76) and the signal-to-noise ratio $d^2$ defined as
in (3-53). A test that, like this one, does not require knowing the amplitude $A$ of the
signal, yet is optimum under the extended Neyman-Pearson criterion for a signal
with arbitrary amplitude $A$, can be said to be uniformly most powerful with respect
to amplitude.

When a test that is uniformly most powerful with respect to amplitude does
not exist with the prior probability density function $z(\boldsymbol\theta)$ adopted for the remaining
parameters, the receiver may be designed to be optimum for a particular value $A_s$
of the signal amplitude lying somewhere in the range of expected amplitudes $A$. We
call $A_s$ the standard amplitude. For signals of other amplitudes $A \ne A_s$ the receiver
will not be optimum, but the loss of signal detectability will seldom be serious.

Alternatively, a prior probability density function $z_A(A)$ may be adopted on
the basis of some physical model of how a radar signal is reflected from its target
or how a communication signal propagates from transmitter to receiver, as when
the channel fades in some manner. The parameters of the prior density function
$z_A(A)$ may themselves be only roughly known, and standard values of them must be
accepted for receiver design.

When, for instance, the signal strength undergoes Rayleigh fading, its amplitude
$A$ is a random variable governed by a Rayleigh distribution,

$$z(A) = \frac{A}{s^2} \exp\left( -\frac{A^2}{2 s^2} \right) U(A). \tag{3-105}$$

Such a signal can often be conveniently considered as having the form

$$s(t) = \mathrm{Re}\,[a F(t)\, e^{i\Omega t}], \tag{3-106}$$

in which $a = a_x + i a_y$ is a complex signal amplitude with $a_x$ and $a_y$ statistically
independent Gaussian random variables with zero expected values and equal variances
$s^2$; $A = |a|$. The phase $\psi = \arg a$ is then uniformly distributed over $(0, 2\pi)$.
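
The link between the Rayleigh density (3-105) and the complex amplitude of (3-106) can be checked by simulation. The sketch below is illustrative only: the value of $s$, the sample size, the seed, and the function names are assumptions. It draws $a = a_x + i a_y$ with independent Gaussian components and compares the sample mean of $A^2 = |a|^2$ with the value $2s^2$ implied by (3-105), and the mean phase with $\pi$.

```python
import cmath
import math
import random

def rayleigh_pdf(A, s):
    """Prior density (3-105) of the amplitude A; the unit step U(A) makes it zero for A < 0."""
    return (A / s**2) * math.exp(-A**2 / (2.0 * s**2)) if A >= 0 else 0.0

def draw_complex_amplitude(s, rng):
    """One sample of a = a_x + i a_y with independent N(0, s^2) components."""
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

rng = random.Random(1)      # fixed seed, purely for reproducibility
s = 1.5                     # hypothetical fading parameter
samples = [draw_complex_amplitude(s, rng) for _ in range(20000)]

mean_sq = sum(abs(a) ** 2 for a in samples) / len(samples)          # expected near 2 s^2
mean_phase = sum(cmath.phase(a) % (2.0 * math.pi) for a in samples) / len(samples)  # near pi
pdf_area = sum(rayleigh_pdf(k * 0.01, s) * 0.01 for k in range(2000))  # Riemann sum of (3-105)
print(mean_sq, 2.0 * s * s, mean_phase, pdf_area)
```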
The threshold or weak-signal approximation is based on the viewpoint that in
the least favorable situation for detecting a signal, the signal is very weak, and a 
receiver optimum for the weakest signals will be adequate for stronger ones. The 
average likelihood ratio is now defined by 



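$$\Lambda(v) = \int_0^\infty z_A(A) \int_\Theta z(\boldsymbol\theta)\, \Lambda(v \mid A, \boldsymbol\theta)\, dA\, d^{m-1}\boldsymbol\theta. \tag{3-107}$$

The likelihood ratio

$$\Lambda(v \mid A, \boldsymbol\theta) = \frac{p_1(v \mid A, \boldsymbol\theta)}{p_0(v)}$$

for detection of a signal $s(t; A, \boldsymbol\theta)$ with known parameters is expanded in a power
series in the amplitude $A$:

$$\Lambda(v \mid A, \boldsymbol\theta) = 1 + A\, \Lambda_A(v \mid 0, \boldsymbol\theta) + \tfrac{1}{2} A^2 \Lambda_{AA}(v \mid 0, \boldsymbol\theta) + \cdots,$$

in which

$$\Lambda_A(v \mid 0, \boldsymbol\theta) = \left. \frac{\partial \Lambda(v \mid A, \boldsymbol\theta)}{\partial A} \right|_{A=0}, \qquad
\Lambda_{AA}(v \mid 0, \boldsymbol\theta) = \left. \frac{\partial^2 \Lambda(v \mid A, \boldsymbol\theta)}{\partial A^2} \right|_{A=0},$$

and so on. When this series is substituted into (3-107) and the integrations over the
amplitude $A$ are carried out, we find

$$\Lambda(v) = 1 + \overline{A} \int_\Theta z(\boldsymbol\theta)\, \Lambda_A(v \mid 0, \boldsymbol\theta)\, d^{m-1}\boldsymbol\theta
+ \tfrac{1}{2} \overline{A^2} \int_\Theta z(\boldsymbol\theta)\, \Lambda_{AA}(v \mid 0, \boldsymbol\theta)\, d^{m-1}\boldsymbol\theta + \cdots, \tag{3-108}$$

where

$$\overline{A^k} = \int_0^\infty A^k z_A(A)\, dA$$

is the $k$th moment of the prior probability density function of the amplitude $A$. The
threshold statistic is the nonvanishing coefficient

$$g_\theta = \int_\Theta z(\boldsymbol\theta) \left. \frac{\partial^k \Lambda(v \mid A, \boldsymbol\theta)}{\partial A^k} \right|_{A=0} d^{m-1}\boldsymbol\theta \tag{3-109}$$

of lowest order in this series. In most problems, with proper definition of the
amplitude $A$, this will be the term with $k = 2$. When appropriate, the likelihood
functional $\Lambda[v(t) \mid A, \boldsymbol\theta]$ figures in the definition of the threshold statistic, the limit
$n \to \infty$ of an infinite number of samples of the input $v(t)$ having been taken. The
threshold statistic $g_\theta$ is compared with a decision level set to yield a preassigned
false-alarm probability [Mid60a, Sec. 19.4], [Rud61], [Mid66]. Several examples of
the threshold statistic will be derived and analyzed in the sequel. We shall find it
appropriate mainly when the receiver can base its decision on a large number of
statistically independent inputs, whereupon the input signal-to-noise ratio required
for it to attain practical values of the false-alarm and detection probabilities is indeed
small.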
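
Why the term with $k = 2$ is usually the lowest nonvanishing one can be seen in a toy model, which is an assumption introduced here for illustration and is not one of the examples treated in the text. Let the samples be $v_k = \pm A s_k + n_k$ with unit-variance white Gaussian noise and an equiprobable unknown sign playing the role of $\boldsymbol\theta$. Then $\Lambda(v \mid A, \pm) = \exp(\pm A g - \tfrac{1}{2} A^2 E)$ with $g = \sum_k s_k v_k$ and $E = \sum_k s_k^2$, and averaging over the sign gives $\exp(-\tfrac{1}{2} A^2 E) \cosh(A g)$. The sketch differentiates this numerically at $A = 0$.

```python
import math

def avg_likelihood_ratio(A, v, s):
    """Likelihood ratio for +/- A*s in unit-variance white Gaussian noise,
    averaged over an equiprobable sign (toy model, not from the text)."""
    g = sum(sk * vk for sk, vk in zip(s, v))   # correlation of data with signal shape
    E = sum(sk * sk for sk in s)               # energy of the unit-amplitude signal
    return math.exp(-0.5 * A * A * E) * math.cosh(A * g)

def derivative_at_zero(f, order, h=1e-3):
    """Central finite-difference estimate of the first or second derivative at A = 0."""
    if order == 1:
        return (f(h) - f(-h)) / (2.0 * h)
    if order == 2:
        return (f(h) - 2.0 * f(0.0) + f(-h)) / (h * h)
    raise ValueError("only orders 1 and 2 are implemented")

# Hypothetical signal shape and data.
s = [1.0, -1.0, 1.0, 1.0]
v = [0.3, -1.2, 0.8, 0.5]
f = lambda A: avg_likelihood_ratio(A, v, s)

first = derivative_at_zero(f, 1)     # vanishes: the sign average removes the linear term
second = derivative_at_zero(f, 2)    # approximates g**2 - E
g = sum(sk * vk for sk, vk in zip(s, v))
E = sum(sk * sk for sk in s)
print(first, second, g * g - E)
```

In this toy model the coefficient of lowest order is $g^2 - E$, so a threshold receiver would in effect compare $|g|$ with a decision level.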
3.6.4 Maximum-likelihood Detection

In some situations the threshold approximation may emphasize signals so weak that
there is no hope of detecting them anyhow, and the designer should look for superior
detection strategies for signals strong enough to be detected with a useful probability.
When the signal is strong, the dominant contribution to the likelihood ratio (3-89),
averaged with respect to all the parameters $\boldsymbol\theta$, comes from the neighborhood of that
point $\boldsymbol\theta_m$ of the space $\Theta$ for which $\Lambda(v \mid \boldsymbol\theta)$ is maximum,

$$\Lambda(v \mid \boldsymbol\theta_m) \ge \Lambda(v \mid \boldsymbol\theta), \qquad \forall\, \boldsymbol\theta \ne \boldsymbol\theta_m,$$

and the average likelihood ratio $\Lambda(v)$ is then approximately proportional to that
maximum value. The maximum-likelihood strategy compares the maximum value
with a decision level $\Lambda_1$ chosen to yield the preassigned false-alarm probability. This
strategy will be studied in connection with detecting a signal of unknown time of
arrival, as in radar, and it will be found superior to the threshold detector for signals
with useful detection probabilities.

As an example of maximum-likelihood detection, let us consider a spread-spectrum
communication system. In order to deceive an enemy, a transmitter of
messages coded into binary digits 0 and 1 utilizes one of $M$ different carrier
frequencies $\Omega_m$ for sending the pulses representing the 1's in a message; it sends no
signal at all for the 0's. The receiver knows in advance the sequence with which
the several carrier frequencies $\Omega_m$ will be selected. This is a simple version of a
spread-spectrum system.

If we are the enemy, however, and ignorant of that sequence, we are confronted
with the problem of detecting a signal of the form

$$s_j(t) = \mathrm{Re}\, S_j(t) \exp(i\Omega_j t + i\psi), \qquad 1 \le j \le M, \tag{3-110}$$

in which the carrier frequency $\Omega_j$ might take on any one of $M$ possible values.
These frequencies we assume to be so far apart that the signals $s_j(t)$ are for all
practical purposes orthogonal. The phase $\psi$ is unknown and distributed uniformly
over $(0, 2\pi)$. The frequency $\Omega_j$ can be thought of as an unknown parameter taking
on only one of a finite number $M$ of discrete values. Suppose that we have observed
that when 1's are transmitted, frequency $\Omega_j$ is used with relative frequency, or prior
probability, $z_j$, $1 \le j \le M$. These prior probabilities sum to 1:

$$\sum_{j=1}^{M} z_j = 1.$$

The receiver must thus choose between two hypotheses,

$$(H_0):\ v(t) = n(t),$$
$$(H_1):\ v(t) = n(t) + s_j(t),$$

where the index $j$ may take any value from 1 to $M$ with conditional probabilities $z_j$.
Let us assume that $n(t)$ is white Gaussian noise with unilateral spectral density $N$
and that the input is as usual observed during the interval $(0, T)$.

If $v$ denotes a set of $n$ samples of our input $v(t)$, the receiver must form an
average likelihood ratio

$$\Lambda(v) = \sum_{j=1}^{M} z_j \frac{p_j(v)}{p_0(v)}, \tag{3-111}$$

where $p_0(v)$ is the joint probability density function of the samples $v$ under hypothesis
$H_0$ that only random noise is present. If we pass to the limit of an infinite number of
samples as in Sec. 3.3, we see that the receiver must generate the average likelihood
functional

$$\Lambda[v(t)] = \sum_{j=1}^{M} z_j \Lambda_j[v(t)], \tag{3-112}$$

where $\Lambda_j[v(t)]$ is the likelihood functional for detecting the $j$th signal $s_j(t)$ among
those given in (3-110).

Let us adopt a reference carrier frequency $\Omega_r$ in the neighborhood of the carrier
frequencies of the signals. With respect to this reference frequency, the complex
envelope of the $j$th signal is $S_j(t) \exp[i(\Omega_j - \Omega_r)t + i\psi]$. According to (3-54), as
written for a signal having this complex envelope and received in white noise, the
$j$th likelihood functional is given by

$$\Lambda_j[v(t); \psi] = \exp\left\{ \frac{1}{N} \mathrm{Re} \int_0^T S_j^*(t) \exp[-i(\Omega_j - \Omega_r)t - i\psi]\, V(t)\, dt
- \frac{1}{2N} \int_0^T |S_j(t)|^2\, dt \right\}$$

before averaging over the phase $\psi$. If we now average over $0 \le \psi < 2\pi$, we find as
in (3-60) the average likelihood functional

$$\Lambda_j[v(t)] = \exp(-\tfrac{1}{2} d_j^2)\, I_0(R_j), \tag{3-113}$$

where

$$d_j^2 = \frac{1}{N} \int_0^T |S_j(t)|^2\, dt \tag{3-114}$$

is the signal-to-noise ratio of the $j$th signal, and as in (3-80)

$$R_j = \frac{1}{N} \left| \int_0^T S_j^*(t) \exp[-i(\Omega_j - \Omega_r)t]\, V(t)\, dt \right|.$$

The statistic $R_j$ is the rectified output, sampled at the time $t = T$, of a filter matched
to the $j$th signal $s_j(t)$ of (3-110); the pass frequency of this narrowband filter equals
$\Omega_j$. The functional in (3-113) is substituted into (3-112) to determine the average
likelihood functional, which the receiver must compare with a decision level

$$\Lambda_0 = \frac{\zeta_0}{\zeta_1};$$

$\zeta_0$ is the relative frequency of 0's, and $\zeta_1$ is the relative frequency of 1's in the messages
being transmitted. This receiver requires us to know both the amplitudes of all the
signals being utilized and their relative frequencies. Calculating the probability of
error it incurs would be extremely difficult because of the nonlinear manner in which
the input $v(t)$ is processed.

If the various signals are utilized equally often, $z_j = M^{-1}$, if the signal-to-noise
ratios of the signals at the input are the same, $d_j^2 \equiv d^2$, and if the received signals
are strong enough to be detected with a low probability of error, there will be one
term in the average likelihood functional

$$\Lambda[v(t)] = \frac{1}{M} e^{-d^2/2} \sum_{j=1}^{M} I_0(R_j) \tag{3-115}$$

that under hypothesis $H_1$ is much larger than the rest, and this is most likely to be
the term corresponding to the signal actually transmitted. This term will dominate
the sum in (3-115) because of the rapidly accelerating increase of the modified Bessel
function $I_0(r)$ with increasing $r$.
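
The dominance of a single term in (3-115) is easy to see numerically. The following sketch evaluates the sum (3-112) with the terms (3-113), taking the matched-filter outputs $R_j$ as given numbers; the particular values of $R_j$ and $d^2$ are hypothetical, and $I_0$ is computed from its power series so that the example needs only the standard library.

```python
import math

def bessel_i0(x, max_terms=200):
    """Modified Bessel function I_0(x) from its power series sum_k (x^2/4)^k / (k!)^2."""
    q = 0.25 * x * x
    term, total = 1.0, 1.0
    for k in range(1, max_terms):
        term *= q / (k * k)
        total += term
        if term < 1e-17 * total:
            break
    return total

def likelihood_terms(R, d2, z):
    """Individual terms z_j * exp(-d_j^2 / 2) * I_0(R_j) of (3-112) with (3-113)."""
    return [zj * math.exp(-0.5 * dj2) * bessel_i0(Rj) for Rj, dj2, zj in zip(R, d2, z)]

def average_likelihood_functional(R, d2, z):
    """Average likelihood functional (3-112)."""
    return sum(likelihood_terms(R, d2, z))

def largest_term_share(R, d2, z):
    """Fraction of the sum contributed by its largest term."""
    terms = likelihood_terms(R, d2, z)
    return max(terms) / sum(terms)

# Hypothetical outputs: M = 4 equally likely signals with d^2 = 25; the third filter
# holds the signal, the others respond to noise alone.
R = [4.0, 7.0, 27.0, 5.5]
print(average_likelihood_functional(R, [25.0] * 4, [0.25] * 4),
      largest_term_share(R, [25.0] * 4, [0.25] * 4))
```

With outputs of this size the largest term accounts for practically the whole sum, which is what motivates replacing the sum by its largest member.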

A receiver that is nearly as effective as the optimum receiver is therefore one
that bases its decision on the largest of the $M$ rectified outputs at time $t = T$. That
is, the receiver decides that a 1 was transmitted if

$$\max_j r_j > r_0,$$

where

$$r_j = \frac{R_j}{d} = \frac{1}{N d} \left| \int_0^T S_j^*(t) \exp[-i(\Omega_j - \Omega_r)t]\, V(t)\, dt \right|, \qquad 1 \le j \le M,$$

is proportional to the rectified output of the $j$th matched filter, and $r_0$ is the decision
level. (We have normalized the output in this way for convenience in subsequent
calculations.) If all the data $r_j$ lie below $r_0$, the receiver decides that no signal was
transmitted at any of the $M$ frequencies, that is, that the message digit is 0. If any
datum $r_j$ exceeds $r_0$, the receiver chooses hypothesis $H_1$. We can call this a
maximum-likelihood receiver. The maximum-likelihood receiver is appropriate whenever under
hypothesis $H_1$ one of $M$ orthogonal signals of equal energy may appear, but it is
unknown a priori which one it will be.

By the same analysis as in Sec. 3.4 we find that the density functions of each
datum $r_j$ under the two hypotheses are

$$p_0(r_j) = r_j \exp(-\tfrac{1}{2} r_j^2)\, U(r_j) \qquad (H_0) \tag{3-116}$$

when no signal is present and, in terms of the Rayleigh-Rice density function in
(3-73),

$$p_1(r_j) = g(d_j, r_j) = r_j \exp[-\tfrac{1}{2}(r_j^2 + d_j^2)]\, I_0(d_j r_j)\, U(r_j) \qquad (H_1) \tag{3-117}$$

when a signal $s_j(t)$ with effective signal-to-noise ratio $d_j^2$, defined as in (3-114), is
present.

The false-alarm probability of the maximum-likelihood receiver is 1 minus the
probability under hypothesis $H_0$ that all the rectified outputs $r_j$ fall below the decision
level $r_0$:

$$Q_0 = 1 - [1 - \exp(-\tfrac{1}{2} r_0^2)]^M. \tag{3-118}$$
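
Formula (3-118) can be checked by a short Monte Carlo experiment. The sketch below is only illustrative: the values of $M$ and $r_0$, the seed, and the function names are assumptions. Under $H_0$ each normalized output $r_j$ has the Rayleigh density (3-116), which is generated here as the modulus of two independent unit-variance Gaussian variables.

```python
import math
import random

def false_alarm_probability(r0, M):
    """Q_0 of (3-118) for decision level r0 and M orthogonal signals."""
    return 1.0 - (1.0 - math.exp(-0.5 * r0 * r0)) ** M

def decide(r, r0):
    """Maximum-likelihood receiver: 1 if the largest normalized output exceeds r0, else 0."""
    return 1 if max(r) > r0 else 0

def noise_only_outputs(M, rng):
    """M independent outputs r_j with the unit Rayleigh density (3-116)."""
    return [math.hypot(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(M)]

rng = random.Random(7)            # fixed seed, purely for reproducibility
M, r0, trials = 8, 2.5, 20000     # hypothetical values
alarms = sum(decide(noise_only_outputs(M, rng), r0) for _ in range(trials))
estimate = alarms / trials
print(estimate, false_alarm_probability(r0, M))
```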

Because all signals are being treated alike, and all have the same input signal-to-noise
ratio $d_j^2 \equiv d^2$, we can calculate the probability $Q_1 = 1 - Q_d$ of missing the signal
under the assumption that under hypothesis $H_1$ the first signal $s_1(t)$ is present. By
(3-75)

$$\Pr(r_1 < r_0 \mid H_1) = 1 - Q(d, r_0),$$

where again $Q(\cdot, \cdot)$ is Marcum's $Q$ function (3-76). Furthermore, because the
signals are assumed orthogonal, the distributions of the rectified outputs of the
filters matched to the other $M - 1$ signals are the same as in (3-116), and

$$\Pr(r_j < r_0 \mid H_1,\ j > 1) = 1 - \exp(-\tfrac{1}{2} r_0^2),$$

and therefore

$$Q_1 = 1 - Q_d = [1 - Q(d, r_0)]\, [1 - \exp(-\tfrac{1}{2} r_0^2)]^{M-1} \tag{3-119}$$

is the probability of false dismissal. The value of the decision level $r_0$ for minimum
error probability is calculated by forming the error probability

$$P_e = \zeta_0 Q_0 + \zeta_1 Q_1,$$

differentiating with respect to $r_0$, and setting the result equal to zero. The ensuing
equation, which we leave for the reader to write out, must be solved by trial and
error.

For a receiver based on the Neyman-Pearson criterion, the decision level $r_0$
must be determined from (3-118) for the preassigned false-alarm probability $Q_0$:

$$\exp(-\tfrac{1}{2} r_0^2) = 1 - (1 - Q_0)^{1/M}.$$

The signal-to-noise ratio $D_M^2$ required in order to attain a detection probability
$Q_d$ is then found by solving

$$Q(D_M, r_0) = 1 - (1 - Q_d)\, [1 - \exp(-\tfrac{1}{2} r_0^2)]^{1-M}.$$

The loss in signal detectability caused by the uncertainty about which of the $M$
signals might be present can be measured by the ratio $D_M^2 / D_1^2$, where $D_1^2$ is the
required signal-to-noise ratio when it is known which signal will be present under
hypothesis $H_1$:

$$Q\big(D_1, \sqrt{-2 \ln Q_0}\,\big) = Q_d.$$

In Fig. 3-7 this ratio, expressed in decibels as $10 \log_{10}(D_M^2 / D_1^2)$, has been plotted
versus the number $M$ for a false-alarm probability $Q_0 = 10^{-4}$ and for three values
of the probability $Q_d$ of detection. ($M$ is of course an integer, but the computed
points have been connected by a continuous curve for the sake of clarity.) The loss
is seen to be well under 2 dB even for M as large as 1000.
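
The loss $10 \log_{10}(D_M^2 / D_1^2)$ can be computed along the lines just described. The sketch below is one possible numerical route and makes no claim to reproduce how Fig. 3-7 was obtained: Marcum's $Q$ function is summed from its standard Poisson-mixture series, the equation for $D_M$ is solved by bisection, and the function names and the search interval $(0, 30)$ for $D_M$ are assumptions of this example.

```python
import math

def marcum_q(a, b, tol=1e-15, max_terms=5000):
    """Marcum's Q(a, b) as sum_k Poisson(k; a^2/2) * sum_{j<=k} Poisson(j; b^2/2)."""
    x, y = 0.5 * a * a, 0.5 * b * b
    p = math.exp(-x)        # Poisson weight with mean x at k = 0
    term = math.exp(-y)     # Poisson term with mean y at j = 0
    cdf = term              # running sum over j <= k
    total = p * cdf
    for k in range(1, max_terms):
        p *= x / k
        term *= y / k
        cdf += term
        total += p * cdf
        if k > x and p < tol:
            break
    return total

def decision_level(Q0, M):
    """r_0 solving exp(-r0^2 / 2) = 1 - (1 - Q0)^(1/M), from (3-118)."""
    e = -math.expm1(math.log1p(-Q0) / M)
    return math.sqrt(-2.0 * math.log(e))

def required_snr_root(Qd, Q0, M):
    """D_M solving Q(D_M, r0) = 1 - (1 - Qd) [1 - exp(-r0^2/2)]^(1 - M), by bisection."""
    r0 = decision_level(Q0, M)
    target = 1.0 - (1.0 - Qd) * (1.0 - math.exp(-0.5 * r0 * r0)) ** (1 - M)
    lo, hi = 0.0, 30.0      # assumed bracket; Q(a, b) increases with a
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if marcum_q(mid, r0) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def loss_db(Qd, Q0, M):
    """Loss 10 log10(D_M^2 / D_1^2) in decibels."""
    return 20.0 * math.log10(required_snr_root(Qd, Q0, M) / required_snr_root(Qd, Q0, 1))

for M in (1, 10, 100):
    print(M, loss_db(0.9, 1e-4, M))
```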

This receiver, as we shall see in Sec. 7.2, furnishes an approximate model of
a maximum-likelihood receiver for detecting a signal of unknown arrival time, as
in radar. The unknown parameter $\theta$ will initially take on a finite set of values
corresponding to the centers of brief intervals into which the entire observation
interval $(0, T)$ will have been divided. The signal, when present, may appear during
any one of these subintervals, but in which is unknown a priori. When this receiver
is treated from the standpoint of the extended Neyman-Pearson criterion, it requires
no knowledge of the amplitude of the signal, the decision level $r_0$ being determined
by the false-alarm probability, as in (3-118). It will be found superior, under ordinary
conditions, to the threshold receiver based on the concepts described in Sec. 3.6.3.

Figure 3-7. Loss $D_M^2 / D_1^2$ in decibels when any of $M$ orthogonal signals with
equal signal-to-noise ratio might be present under hypothesis $H_1$. $Q_0 = 10^{-4}$. The
curves are indexed with the probability $Q_d$ of detection.



Problems

3-1. Consider a memoryless nonlinear device whose output $v_0(t)$ at any time $t$ is a function
$g(\cdot)$ of its input $v(t)$ at the same time $t$:

$$v_0(t) = g(v(t)).$$

If the input is an amplitude-modulated signal $s(t) = S(t) \cos \Omega t$, the output will be a
sum of harmonics of the carrier frequency $\Omega$:

$$v_0(t) = \sum_{k=0}^{\infty} G_k[S(t)] \cos k\Omega t.$$

Show how to calculate the functionals $G_k[S(t)]$ by means of the Fourier series for
the periodic function $g(S \cos x)$, $-\pi < x < \pi$. Evaluate them for the linear full-wave
rectifier, $g(v) = |v|$, and for the quadratic rectifier, $g(v) = v^2$.

3-2. Let a rectangular narrowband signal

$$s(t) = \begin{cases} A \cos \Omega t, & 0 < t < T, \\ 0, & t < 0,\ t > T, \end{cases}$$

be impressed on the simply resonant circuit of Fig. 3-3. Calculate the complex envelope
of the output by means of (3-16). Take the carrier frequency of the signal different from
the resonant frequency of the filter by an amount that is arbitrary, but on the order of
the bandwidth of the filter.

3-3. Let $\phi(\tau)$ be the complex autocovariance function of a stationary narrowband Gaussian
random process $x(t)$ of expected value 0. In terms of $\phi(\tau)$ find the autocovariance
function of the output of a quadratic rectifier to which $x(t)$ is applied. The rectifier
forms $[x(t)]^2$ and completely attenuates the components of twice the carrier frequency.
Hint: Use (3-44).

3-4. Let <j>fr) be the complex autocovariance function of narrowband Gaussian noise. For 
two different times t\ and t 2 let the complex envelope of the noise be Rj exp iQj = 
X(tj) + iY(tj)J = 1, 2, Show that the joint probability density function of the ampli- 
tudes R\ and R 2 is 



R\ + R 



2a 2 (l - 



*(0) 



^JH^r^)} 



ffinf: Convert (3-40) for « = 2 to the joint probability density function of R U R 2 ,Q U 
and 2 and integrate out the phases 9 U 9 2 [Mid48]. 
3-5. For noise of the same kind as in Problem 3-4 find the probability density function of 
the phase difference i]/ = e 2 - 9 ( by integrating the joint probability density function 
found in that problem over R\ and R 2 instead of over the phases [Mid48]. Hint: 
Change variables to (z, /), where fl, = crz cos t and R 2 = crz sin /, and integrate first 
over < z < oo and second over < / < it/2. Answer: With a = \r\ cos(i|/ - (3) and 
P = arg r, 

p(M = (27r)- ! (l - H 2 )(l - « 2 r V2 [VT^ + fl (l7r + sin' 1 «)]. 

3-6. A sinusoidal signal of amplitude A is added to Gaussian narrowband noise of mean-square
amplitude $\sigma^2 = \phi(0)$. Show that the phase $\theta$ of the sum, measured with respect
to that of the sinusoid, has the probability density function

$$p(\theta) = \frac{1}{2\pi} e^{-a^2/2} + \frac{a \cos\theta}{\sqrt{2\pi}}\, e^{-a^2 \sin^2\theta / 2}\, [1 - \operatorname{erfc}(a \cos\theta)]$$

with $a = A/\sigma$. Work out a Gaussian approximation to this probability density function
when $a \gg 1$ [Mid48].

3-7. Let $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$ have circular Gaussian distributions as in (3-39)
with $\phi_{kk} = 1$, k = 1, 2. These random variables are all statistically independent and
have expected values zero. Find the probability that

$$|z_1| + |z_2| > a, \qquad a > 0.$$

3-8. Show that the likelihood ratio in (3-60) is equal to $dQ_d/dQ_0$, where $Q_0$ and $Q_d$ are the
false-alarm and detection probabilities calculated in Sec. 3.4.

3-9. Carry out the derivation of (3-84) from the previous expression for the error probability. 
3-10. In an M-ary balanced channel one of M orthogonal signals is transmitted every T
seconds with relative frequency 1/M. The signals are received with energy E in white
Gaussian noise of spectral density N, but with an unknown phase that can be taken as
uniformly distributed over $(0, 2\pi)$. There is a possibility that fading might destroy the
signals, and to indicate this a null zone is provided in the decision mechanism [Blo57].
The rectified outputs of filters matched to each of the signals are compared at the end
of each interval (0, T), and the receiver sends to the decoder the symbol corresponding
to the filter whose rectified output is the greatest, except that if all the outputs fall below
a certain amplitude a, the receiver indicates an "erasure." Calculate the probabilities
that the transmitted signal is correctly received, that an erasure is indicated, and that
an incorrect symbol is sent on to the decoder.
3-11. The signal $s(t; A) = A\, \mathrm{Re}\, F(t) \exp(i\Omega t + i\psi)$ is to be detected in narrowband
Gaussian noise having a complex autocovariance function $\phi(t_1, t_2)$. The input is
$v(t) = \mathrm{Re}\, V(t) \exp i\Omega t$, and the observation interval is (0, T). The signal is subject
to Rayleigh fading, so that its complex amplitude

$$a = A e^{i\psi} = a_x + i a_y,$$

as in (3-106), is a circular complex Gaussian random variable with

$$\operatorname{Var} a_x = \operatorname{Var} a_y = s^2;$$

E(a) = 0. Write the conditional likelihood functional $\Lambda[v(t) \mid a_x, a_y]$ for detecting this
signal in the Gaussian noise in terms of the circular complex Gaussian random variable

$$z = \int_0^T G^*(t) V(t)\, dt,$$

where G(t) is the solution of the integral equation (3-104). Now average the likelihood
functional with respect to the random variables $a_x$ and $a_y$ to determine the average
likelihood functional $\bar\Lambda[v(t)]$ defined by (3-100), in which $\theta = (a_x, a_y)$. Show how this
detection problem is equivalent to that in Example 1-3 with n = 2 and

$$N_0 = d^2 = \int_0^T G^*(t) F(t)\, dt, \qquad N_1 = N_0 + d^4 s^2.$$

One can always normalize the signal amplitude so that $d^2 = 1$.
3-12. Let the amplitude A of a narrowband signal $A\, \mathrm{Re}\, F(t) \exp(i\Omega t + i\psi)$ be distributed
according to the Rayleigh distribution in (3-105), and let the phase $\psi$ be uniformly
distributed over $(0, 2\pi)$. It is to be detected in the presence of white Gaussian noise of
unilateral spectral density N. Find the optimum detection statistic and relate its decision
level to the critical value $\Lambda_0$ associated with the Bayes criterion (1-17). Calculate
the minimum Bayes cost $C_{\min}(s)$ as a function of s and investigate its behavior as s
approaches 0. Show that as s vanishes, for $\Lambda_0 > 1$, $C_{\min}(0) - C_{\min}(s)$ approaches 0
faster than any power of the average signal-to-noise ratio. Take

$$\int_0^T |F(t)|^2\, dt = 1$$

without loss of generality.
3-13. In the incoherent M-ary channel treated in Sec. 3.5.4, suppose that the amplitude of
the signal is subject to Rayleigh fading, so that the complex amplitude of the jth signal
has the form

$$S_j(t) = A F_j(t), \qquad 0 < t < T,$$

where the complex envelopes $F_j(t)$ are orthonormal over the interval (0, T). The amplitudes
A have a Rayleigh distribution as in Problem 3-11, with the variances $s^2$ the
same for all signals. Assume that the signals are received in white Gaussian noise with
unilateral spectral density N. Find the optimum receiver for deciding which of the M
equally likely signals is present in its input, and calculate its probability of error. Hint:
One way to do this is to express the likelihood functionals in (3-85) in terms of A and
average with respect to the prior probability density function z(A), using the normalization
integral for the Rayleigh-Rice distribution in (3-73). It is simpler, however, to
use the result of Problem 3-11.

3-14. Evaluate the performance of the maximum-likelihood receiver treated in Sec. 3.6.4
when the signals $s_k(t)$ of (3-110) are subject to Rayleigh fading, their complex amplitudes
$a_x + i a_y$ having the circular Gaussian distribution in Problem 3-11 with
$\operatorname{Var} a_x = \operatorname{Var} a_y = s^2$.

3-15. Carry through the analysis of the system described in Problem 3-10 under the same 
assumption as in Problem 3-13 that the signals are subject to Rayleigh fading. 
4



Detection in Multiple 
Observations 



4.1 OPTIMUM DETECTOR 

Radar detection of a fixed target at a given range was treated in Chapters 2 and 3 as 
a matter of deciding whether an echo signal of specified form is present in the input 
v(t) to a receiver during a certain interval (0, T) after an electromagnetic pulse has 
been transmitted toward the location in question. How the receiver should process 
its input v(t) in order to make that decision most efficiently was described there.
Usually, however, a radar sends more than one pulse in the direction of a possible 
target, and the presence of a target is indicated not by only one signal, but by a train 
of echo signals appearing at the input to the receiver. If there is no target, the input 
contains only noise. 

Denote by $v_k(t)$ the input to the receiver during the T-second interval following
transmission of the kth pulse; the time t will be counted in each interval from its
beginning. Suppose that the decision about the presence or absence of a target is to
be based on the returns from M transmitted pulses, k = 1, 2, ..., M. Denote the
echo signal in the kth interval by $s_k(t)$. The receiver must then choose between two
hypotheses,

$$\begin{aligned} H_0 &: \; v_k(t) = n_k(t), \\ H_1 &: \; v_k(t) = s_k(t) + n_k(t), \end{aligned} \qquad 0 \le t \le T, \quad k = 1, 2, \ldots, M, \tag{4-1}$$

where $n_k(t)$ is the noise during the kth interval. We shall assume that this noise
is white and Gaussian, with unilateral spectral density $N_k$ in the kth interval. Our
analysis could easily be extended to colored noise, provided that the intervals are
separated sufficiently so that the noise in one interval is independent of that in any
other, but to do so would not be particularly instructive. The reader will perceive
how our equations need to be modified in order to account for noise that is not
white.

Hypothesis $H_1$ will in general be composite; the signals $s_k(t)$ may depend on
parameters such as amplitude and phase that are known only imprecisely. These parameters
may even vary randomly from one input to another. Suitable assumptions
about their prior probability density functions will have to be made. Hypothesis $H_0$,
on the other hand, we shall assume at present to be simple. Although the noise
spectral densities $N_k$ may differ from one input to another, we presume they are
known.

In a binary communication system transmitting information coded into 0's and
1's, a 1 may be dispatched by sending a signal in each of M successive intervals;
these signals are received as $s_k(t)$, k = 1, 2, ..., M. For the 0's nothing is sent in
any interval. The receiver will then be confronted with a hypothesis-testing problem
of the type of (4-1). Alternatively, the transmitter might, for each 1, send a quasiharmonic
signal in each of M well-separated frequency bands. These signals would
be received as

$$s_k(t) = \mathrm{Re}\, S_k(t) \exp(i\Omega_k t + i\psi_k), \tag{4-2}$$

where $\Omega_k$ is the carrier frequency at the center of the kth band. The receiver must
again choose between hypotheses $H_0$ and $H_1$ as in (4-1), but now the inputs $v_k(t)$
on which its decision is based are observed simultaneously,

$$v_k(t) = \mathrm{Re}\, V_k(t) \exp i\Omega_k t, \qquad 0 < t < T;$$

$V_k(t)$ is the complex envelope of the input in the kth frequency band about carrier
frequency $\Omega_k$. When the inputs are simultaneous in this way, we assume them far
enough apart in frequency so that their noise components $n_k(t)$ are all statistically
independent. In diversity communications, signals are thus sent at different frequencies
simultaneously in order to combat fading that may cause the transmissivity of
the medium to vary randomly, but independently, in different frequency bands. Diversity
techniques have been described and analyzed by Stein [Ste66] and Kennedy
[Ken69].

In sonar detection acoustic echoes are picked up by an array of transducers that
convert them to electrical signals. If $v_k(t)$ represents the output of the kth transducer
of an M-element array and if $s_k(t)$ is the component of the signal induced in that
transducer by the acoustic echo, the receiver must carry out a hypothesis test of
the same type as in (4-1). The method by which it combines the inputs $v_k(t)$ in
order most effectively to detect echoes from a target in a given direction is called
beamforming. We shall treat it in an elementary way in Sec. 4.3. In Sec. 4.4 we
shall describe a method for comparing the performances of two different ways of
processing M independent inputs to the receiver when M is very large. Section
4.5 introduces the subject of distributed-detection systems in which a number of
independent sensors search for a common signal and transmit their binary decisions
to a central processor that makes the final decision as to its presence or absence.

In what follows we shall analyze the hypothesis test of (4-1) under various
assumptions about the signals $s_k(t)$. We shall begin by assuming them completely
known, and later we shall permit them to have phases that are either identical,
but random, or independently random from one signal to another. Both the design
of the receiver and, insofar as possible, the calculation of its performance will be
presented under these various conditions.

4.1.1 Complete Coherence 

When the signals $s_k(t)$ are known in all respects, the optimum detector can be
specified by a straightforward extension of the results of Chapter 2. Samples of
any one input $v_k(t)$ are independent of those of any other input because we assume
statistical independence of all the noise components $n_k(t)$. The joint probability
density functions $p_0(v)$ and $p_1(v)$ of all the samples therefore factor into products

$$p_j(v) = \prod_{m=1}^{M} p_j^{(m)}(v), \qquad j = 0, 1,$$

of the probability density functions $p_j^{(m)}(v)$ of samples taken in the several inputs
$v_m(t)$, m = 1, 2, ..., M. The likelihood ratio therefore also factors

$$\Lambda(v) = \frac{p_1(v)}{p_0(v)} = \prod_{m=1}^{M} \frac{p_1^{(m)}(v)}{p_0^{(m)}(v)}$$

into a product of the likelihood ratios of the M individual inputs. When we pass to
the limit of an infinite number of samples of each input, we find that the likelihood
functional for the set $\{v_k(t)\}$ of inputs factors as a product

$$\Lambda[\{v_k(t)\}] = \prod_{m=1}^{M} \Lambda[v_m(t)]$$

of the likelihood functionals of each input. These are given by (2-74), and putting
them together we write the overall likelihood functional as

$$\Lambda[\{v_k(t)\}] = \exp\left\{\sum_{m=1}^{M} \frac{1}{N_m}\left[2\int_0^T s_m(t) v_m(t)\, dt - \int_0^T [s_m(t)]^2\, dt\right]\right\}.$$

The decision between hypotheses $H_0$ and $H_1$ is therefore optimally based on the
sufficient statistic

$$g = \sum_{m=1}^{M} \frac{1}{N_m} \int_0^T s_m(t) v_m(t)\, dt. \tag{4-3}$$

The mth term of this sum is the output, at the end of the observation interval (0, T),
of a filter matched to the mth signal $s_m(t)$, weighted inversely with the strength $N_m$
of the noise at its input.

As in Chapter 2, the statistic g is a Gaussian random variable under each
hypothesis, and the reader can easily show that the false-alarm and detection probabilities
are

$$Q_0 = \operatorname{erfc} x, \qquad Q_d = \operatorname{erfc}(x - D),$$

where the effective signal-to-noise ratio $D^2$ is given by

$$D^2 = \sum_{m=1}^{M} \frac{2E_m}{N_m}, \qquad E_m = \int_0^T [s_m(t)]^2\, dt. \tag{4-4}$$

This quantity $D^2$, furthermore, is the maximum signal-to-noise ratio attained by any
linear combination

$$g' = \sum_{m=1}^{M} a_m \int_0^T s_m(t) v_m(t)\, dt \tag{4-5}$$

of the outputs of the matched filters. For this reason, the weighting used in (4-3) is
called maximal-ratio combining in studies of diversity communication systems.
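The claim that the weights of (4-3) maximize the signal-to-noise ratio can be checked numerically. The following is a minimal sketch, not taken from the text: the function names and the sample energies and noise densities are hypothetical, and it assumes white noise of unilateral spectral density $N_m$, so that the mth matched-filter output has mean $E_m$ under $H_1$ and variance $N_m E_m/2$.

```python
import math

def gaussian_tail(x):
    """Area under the unit normal density from x to infinity.

    Assumption: this is the convention behind 'erfc x' in the formulas
    Q_0 = erfc x, Q_d = erfc(x - D); it differs from math.erfc.
    """
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def combiner_snr(weights, energies, noise_densities):
    """Squared mean-to-standard-deviation ratio of g' in (4-5).

    Under H1 the mean of g' is sum(a_m * E_m); its variance is
    sum(a_m**2 * N_m * E_m / 2) for independent white noise inputs.
    """
    mean = sum(a * e for a, e in zip(weights, energies))
    var = sum(a * a * n * e / 2.0
              for a, e, n in zip(weights, energies, noise_densities))
    return mean * mean / var

# Hypothetical example values (not from the text).
energies = [1.0, 2.0, 0.5]
noise = [0.5, 2.0, 1.0]

d2_total = sum(2.0 * e / n for e, n in zip(energies, noise))  # D^2 of (4-4)
d2_mrc = combiner_snr([1.0 / n for n in noise], energies, noise)
d2_equal = combiner_snr([1.0, 1.0, 1.0], energies, noise)
```

With these numbers the inverse-noise weighting attains $D^2$ exactly, while equal weights fall short of it.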

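When the signals are quasiharmonic with carrier frequencies $\Omega_m$ and phases $\psi_m$,

$$s_m(t) = \mathrm{Re}\, S_m(t) \exp(i\Omega_m t + i\psi_m), \qquad 1 \le m \le M, \tag{4-6}$$

we can write down the likelihood functional by putting $Q(t) = N_m^{-1} S_m(t) \exp i\psi_m$
into (3-54), and we obtain the functional

$$\Lambda[\{v_m(t)\}] = \exp\left\{\sum_{m=1}^{M} \frac{1}{N_m}\, \mathrm{Re}\left[\exp(-i\psi_m) \int_0^T S_m^*(t) V_m(t)\, dt\right] - \sum_{m=1}^{M} \frac{1}{2N_m} \int_0^T |S_m(t)|^2\, dt\right\} \tag{4-7}$$

of the complex envelopes $V_m(t)$ of the M inputs:

$$v_m(t) = \mathrm{Re}\, V_m(t) \exp i\Omega_m t, \qquad 1 \le m \le M.$$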

When the signals $s_k(t)$ contain only a single common, but unknown phase
$\psi_k \equiv \psi$, and the phase of the carrier of one signal relative to that of any other is
known to the receiver, the signals are said to be completely coherent. If we assign to
that common unknown phase $\psi$ the uniform distribution over $(0, 2\pi)$, the average
likelihood functional becomes

$$\bar\Lambda[\{v_m(t)\}] = \exp(-\tfrac{1}{2} D^2)\, I_0(R)$$

with

$$R = \left|\sum_{m=1}^{M} \frac{1}{N_m} \int_0^T S_m^*(t) V_m(t)\, dt\right|. \tag{4-8}$$



In order to combine inputs associated with different carrier frequencies $\Omega_m$,
they must first be brought to a common intermediate frequency (i-f) $\Omega_0$. This can
be done by mixing the mth input with a locally generated signal $2\cos(\Omega_m - \Omega_0)t$
by means of a multiplier followed by a filter that discards the components in the
product having carrier frequency $2\Omega_m - \Omega_0$:

$$\begin{aligned} 2 v_m(t) \cos(\Omega_m - \Omega_0)t &= \tfrac{1}{2}\left[V_m(t) e^{i\Omega_m t} + V_m^*(t) e^{-i\Omega_m t}\right] \cdot \left[e^{i(\Omega_m - \Omega_0)t} + e^{-i(\Omega_m - \Omega_0)t}\right] \\ &= \tfrac{1}{2}\left[V_m(t) e^{i\Omega_0 t} + V_m^*(t) e^{-i\Omega_0 t} + V_m(t) e^{i(2\Omega_m - \Omega_0)t} + V_m^*(t) e^{-i(2\Omega_m - \Omega_0)t}\right] \\ &\to \mathrm{Re}\, V_m(t) e^{i\Omega_0 t}. \end{aligned}$$

The process is the same as that in a heterodyne receiver.
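The mixing identity can be verified pointwise. The sketch below is illustrative only: the function name and the numerical values are hypothetical, and the complex envelope is held constant at the instant of evaluation. It returns the mixer product together with the i-f component that is kept and the component at $2\Omega_m - \Omega_0$ that the filter discards.

```python
import cmath
import math

def mixer_terms(envelope, omega_m, omega_0, t):
    """Return (product, i-f term, image term) at time t.

    envelope is the value of the complex envelope V_m(t) at time t;
    the input is v_m(t) = Re[V_m(t) exp(i*omega_m*t)].
    """
    v = (envelope * cmath.exp(1j * omega_m * t)).real
    product = 2.0 * v * math.cos((omega_m - omega_0) * t)
    at_if = (envelope * cmath.exp(1j * omega_0 * t)).real
    at_image = (envelope * cmath.exp(1j * (2.0 * omega_m - omega_0) * t)).real
    return product, at_if, at_image

# Hypothetical values: the product equals the sum of the two components.
product, at_if, at_image = mixer_terms(0.7 - 1.2j, 50.0, 8.0, 0.37)
```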

These inputs, now "at i-f," are passed through filters matched to the signals
$\mathrm{Re}\, S_m(t) \exp i\Omega_0 t$. The output of the mth filter is weighted by $N_m^{-1}$, and the outputs
are added, after being brought into simultaneity by appropriate delay lines if
necessary. The weighted sum is passed through a linear rectifier, and its output is
sampled at what corresponds to the time t = T to produce the statistic R. It is
essential that the relative phases of the signals one to another be known precisely if
the inputs are to be combined in this manner before rectification; the signals must
be completely coherent.

By the same analysis as in Sec. 3.4, the false-alarm and detection probabilities
for this receiver are

$$Q_0 = e^{-b^2/2}, \qquad Q_d = Q(D, b) \tag{4-9}$$

in terms of Marcum's Q function (3-76), with the signal-to-noise ratio $D^2$ given by
(4-4), in which the energy of the mth signal is now

$$E_m = \tfrac{1}{2} \int_0^T |S_m(t)|^2\, dt.$$

4.1.2 Incoherent Signals

A radar receiver is to decide whether a target is present at a given range on the basis
of the returns from M transmitted pulses. By

$$v_k(t) = \mathrm{Re}\, V_k(t) e^{i\Omega t}, \qquad 1 \le k \le M,$$

we denote the input to the receiver following the kth transmitted pulse, gated in
such a way that the interval (0, T) just encompasses the arrival of an echo from the
range in question. The echo signal in the kth interval will be

$$s_k(t) = \mathrm{Re}\, S_k(t) \exp(i\Omega t + i\psi_k). \tag{4-10}$$

If the receiver fails to synchronize its local oscillator accurately with the transmitted
pulses or if the target moves erratically over distances on the order of or greater
than a wavelength of the radiation between one observation interval and the next, the
phases $\psi_k$ of these echoes will be independently random. We assume them uniformly
distributed over $(0, 2\pi)$. The signals are then said to be completely incoherent.

Under hypothesis $H_0$ no target is present, and the M inputs $v_k(t)$ contain only
white Gaussian noise of unilateral spectral density N. Under hypothesis $H_1$ they
contain in addition the quasiharmonic signals in (4-10) with independently random
phases $\psi_k$. The likelihood functional for deciding between $H_0$ and $H_1$ is then given
by (4-7) after it is averaged over the phases $\psi_m$. If for convenience we utilize the
normalized variable $z_j$ defined by

$$z_j = \frac{1}{\sqrt{2 N_j E_j}} \int_0^T S_j^*(t) V_j(t)\, dt, \tag{4-11}$$

the average likelihood functional becomes

$$\bar\Lambda[\{v_k(t)\}] = \prod_{m=1}^{M} \int_0^{2\pi} \exp\left[d_m\, \mathrm{Re}(z_m e^{-i\psi_m}) - \tfrac{1}{2} d_m^2\right] \frac{d\psi_m}{2\pi} = \prod_{m=1}^{M} \exp(-\tfrac{1}{2} d_m^2)\, I_0(d_m r_m), \qquad r_m = |z_m|, \quad d_m^2 = \frac{2E_m}{N_m}. \tag{4-12}$$

This average likelihood functional must be compared with a decision level $\Lambda_0$, which
under the Neyman-Pearson criterion is chosen to yield a preassigned false-alarm
probability. Alternatively, the receiver forms the sufficient statistic

$$\Gamma = \sum_{m=1}^{M} \ln I_0(d_m r_m) \tag{4-13}$$

and compares it with

$$\Gamma_0 = \ln \Lambda_0 + \tfrac{1}{2} D^2, \qquad D^2 = \sum_{m=1}^{M} d_m^2.$$

This detection statistic $\Gamma$ can be generated by amplifying the output of the matched
filter during the mth interval by a factor proportional to the mth signal strength $d_m$
and applying it to a rectifier whose characteristic is $\ln I_0(x)$, by which we mean that
its output is $\ln I_0(|W(t)|)$ when its input is $\mathrm{Re}\, W(t) \exp i\Omega t$. The outputs of the
rectifier are sampled at the end of each interval, the samples are added, and their
sum is compared with the decision level $\Gamma_0$. If $\Gamma > \Gamma_0$, the receiver decides that a
train of echoes has arrived.

The form of the rectifier characteristic $y = \ln I_0(x)$ is shown in Fig. 4-1. For
small values of x,

$$\ln I_0(x) = \tfrac{1}{4} x^2 - \tfrac{1}{64} x^4 + O(x^6), \tag{4-14}$$

and for large values, by (3-61),

$$\ln I_0(x) = x - \tfrac{1}{2} \ln(2\pi x) + \frac{1}{8x} + O(x^{-2}); \tag{4-15}$$

see (C-16). For small values of $d_m r_m$ the optimum detector uses a rectifier that
is nearly quadratic; for large values it is nearly linear. This type of detector was
derived by Marcum [Mar48], and its use has been discussed by Woodward and
Davies [Woo50], Middleton [Mid53], Fleishman [Fle57], and others. Because of the
nonlinearity of the function $\ln I_0(x)$ it is very difficult to determine the false-alarm
and detection probabilities for the receiver utilizing the statistic $\Gamma$ of (4-13).

This receiver depends rigidly on the values of the signal-to-noise ratios $d_m^2$,
1 ≤ m ≤ M. Even if they are all equal, it is necessary to know their common value.
The system cannot provide a test for hypothesis $H_0$ versus hypothesis $H_1$ that is
uniformly most powerful with respect to the amplitudes of the signals.
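The quadratic and linear regimes of the rectifier characteristic $\ln I_0(x)$ in (4-14) and (4-15) can be compared with a direct evaluation. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and $I_0$ is summed from its standard power series rather than by any method given in the text.

```python
import math

def bessel_i0(x, terms=200):
    """I_0(x) from the power series sum over k of (x/2)**(2k) / (k!)**2."""
    total, term = 0.0, 1.0
    half_sq = (x / 2.0) ** 2
    for k in range(terms):
        total += term
        term *= half_sq / ((k + 1) ** 2)
        if term < 1e-17 * total:
            break
    return total

def ln_i0_small(x):
    """Small-argument form (4-14), without the O(x**6) remainder."""
    return x * x / 4.0 - x ** 4 / 64.0

def ln_i0_large(x):
    """Large-argument form (4-15), without the O(x**-2) remainder."""
    return x - 0.5 * math.log(2.0 * math.pi * x) + 1.0 / (8.0 * x)

small_error = abs(math.log(bessel_i0(0.3)) - ln_i0_small(0.3))
large_error = abs(math.log(bessel_i0(20.0)) - ln_i0_large(20.0))
```

Both errors are small at these sample arguments, in keeping with the description of the rectifier as nearly quadratic for weak inputs and nearly linear for strong ones.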
Figure 4-1. Optimum rectifier characteristic, $y = \ln I_0(x)$.



4.2 THE THRESHOLD DETECTOR 

4.2.1 The Weak-signal Approximation 

If in the detection of M incoherent signals, treated in Sec. 4.1.2, the signal amplitudes
are unknown, there are three courses the designer can follow. The first is to pick
a typical set of signal-to-noise ratios $d_k^2$ as specifying a standard set of signals for
which the detector is to be optimum. The form of the detector is then given by
(4-13). The probability of detecting signals with some other set of parameters $d_k^2$
will be less than it might have been, had the M values of $d_k^2$ been known in advance,
but the loss of detectability will not in most situations be serious.

A second course is to choose a joint probability density function $z(d_1, d_2, \ldots, d_M)$
for the signal-strength parameters $d_m$ and to have the receiver base its decisions
on an average likelihood functional

$$\bar\Lambda = \int_0^\infty dd_1 \int_0^\infty dd_2 \cdots \int_0^\infty dd_M\; z(d_1, d_2, \ldots, d_M) \prod_{k=1}^{M} I_0(d_k r_k) \exp(-\tfrac{1}{2} d_k^2). \tag{4-16}$$

If the signals are communication signals that have passed through a fading channel
in which their amplitudes fluctuate randomly, such a joint prior density function
may be derivable from the nature of the fluctuations.

Suppose, for instance, that the channel suffers independent Rayleigh fading
from one input to the next and that the expected signal-to-noise ratio is the same in
each input:

$$z(d_1, d_2, \ldots, d_M) = \prod_{k=1}^{M} z(d_k), \qquad z(d) = \frac{d}{s^2} \exp\left(-\frac{d^2}{2s^2}\right) U(d). \tag{4-17}$$



The integration in (4-16) can be carried out by means of the normalization integral
for the Rayleigh-Rice distribution (3-73), or we can more easily use (3-106) and the
method of Problem 3-11 to show that the average likelihood functional is

$$\bar\Lambda = (1 + s^2)^{-M} \exp\left[\frac{s^2}{2(1 + s^2)} \sum_{k=1}^{M} r_k^2\right], \tag{4-18}$$

where the $r_k$ are, as in (4-12),

$$r_k = \frac{1}{\sqrt{2 N_k E_k}} \left|\int_0^T S_k^*(t) V_k(t)\, dt\right|$$

in terms of the complex envelopes $S_k(t)$ of the signals, their energies $E_k$, and the
unilateral spectral densities $N_k$ of the noise in the several inputs. These statistics $r_k$
are independent of the amplitudes of the signals. A sufficient statistic is now

$$U = \tfrac{1}{2} \sum_{k=1}^{M} r_k^2. \tag{4-19}$$

It can be formed by passing each input $\mathrm{Re}\, V_k(t) \exp i\Omega t$ through a filter matched to
the signal $\mathrm{Re}\, S_k(t) \exp i\Omega t$, which is followed by a quadratic rectifier whose output
is sampled at the end of the observation interval (0, T); the M samples are then
appropriately weighted and summed.

A third approach for the designer is to assume the least favorable situation of
very weak signals and adopt the threshold statistic defined in (3-109). The product
of exponential and Bessel functions in (4-16) is expanded in a power series in which
only terms of first order in the signal-to-noise ratios are retained,

$$I_0(d_k r_k) \exp(-\tfrac{1}{2} d_k^2) \approx 1 + \tfrac{1}{4}(r_k^2 - 2) d_k^2,$$

by (3-61) and the series for the exponential function, and

$$\prod_{k=1}^{M} I_0(d_k r_k) \exp(-\tfrac{1}{2} d_k^2) \approx 1 + \tfrac{1}{4} \sum_{k=1}^{M} d_k^2 (r_k^2 - 2).$$

Putting this into (4-16) and denoting by $\langle d_k^2 \rangle$ the expected value of the kth
signal-to-noise ratio, we find for the average likelihood functional the approximate form

$$\bar\Lambda \approx 1 + \tfrac{1}{4} \sum_{k=1}^{M} \langle d_k^2 \rangle (r_k^2 - 2), \qquad \langle d_k^2 \rangle = \int_0^\infty d_k^2\, z(d_k)\, dd_k.$$

This is called the weak-signal or the threshold approximation.

If the signals to be detected are the M repetitions of a pulse conveying the
symbol 1 of a binary message and if these signals all arrive with the same average
strength in noise of the same spectral density N, the average signal-to-noise ratios
$\langle d_k^2 \rangle$ will be the same, and the average likelihood functional depends on the inputs
only through a threshold statistic U that has the same form as that in (4-19). This
also serves as the threshold statistic in radar when the target and the antenna are
stationary, so that all echoes provide the same average signal-to-noise ratio. The
receiver is said to integrate M filtered and quadratically rectified inputs.

If, on the other hand, the signals are echoes from a fixed target of constant
reflectivity past which the antenna is rotating, the average signal-to-noise ratios $\langle d_k^2 \rangle$
will be proportional to a function $f(\theta)$ representing the combined gains of the radar
antenna on transmission and reception:

$$\langle d_k^2 \rangle = \langle d^2 \rangle f(\theta_k - \theta_0), \tag{4-20}$$

where $\theta_k$ is the azimuth of the antenna at the instant the kth echo is received, and
$\theta_0$ is the azimuth of the target. The values of the beam factor $f(\theta)$, if known, could
be used to calculate an improved detection statistic of the form

$$U' = \tfrac{1}{2} \sum_{k=1}^{M} f(\theta_k - \theta_0)\, r_k^2, \tag{4-21}$$

which requires knowing more about the signals than does the sum of squares in
(4-19). If the energies of the received echoes are truly proportional to $f(\theta_k - \theta_0)$
as in (4-20), the statistic U' will detect the train of signals more reliably than the
statistic U of (4-19).

4.2.2 Performance of the Quadratic 
Threshold Detector 

Not only the threshold detector for quasiharmonic signals with independently random
phases, but also the optimum detector for such signals when subject to independent
Rayleigh fading is based on the sum U of the sampled outputs of quadratic
rectifiers, as in (4-19). We now calculate the false-alarm probability $Q_0$ and the
probability $Q_d$ of detecting a train of M signals with given phases and amplitudes:

$$Q_0 = \Pr(U > U_0 \mid H_0), \qquad Q_d = \Pr(U > U_0 \mid H_1),$$

where $U_0$ is the decision level with which U is compared. The latter probability
will be worked out first; the former follows directly from it when we set the signal
strengths equal to zero.

The statistic U in (4-19) is the sum of M random variables $\tfrac{1}{2} r_j^2$ that are independent
because the noise in one input is statistically independent of that in any other.
The probability density function of the sum U of a number of independent random
variables is most easily determined from its characteristic function $E(\exp i\omega U \mid H_1)$
or its moment-generating function

$$h_1(z) = E(e^{-zU} \mid H_1) = \int_{-\infty}^{\infty} p_1(U) e^{-zU}\, dU,$$

which is the Laplace transform of the probability density function $p_1(U)$. For the
statistic U in (4-19)

$$h_1(z) = E\left[\exp\left(-\tfrac{1}{2} z \sum_{j=1}^{M} r_j^2\right) \,\Big|\, H_1\right],$$

and because the random variables $r_j$ are statistically independent, we find

$$h_1(z) = \prod_{j=1}^{M} E\{\exp[-\tfrac{1}{2} z (x_j^2 + y_j^2)] \mid H_1\}. \tag{4-22}$$

Here $x_j$ and $y_j$ are the real and imaginary parts of the circular complex Gaussian
random variable $z_j$ defined in (4-11).

The random variables $x_j$ and $y_j$ are independent and Gaussian, with unit variances
and expected values given by

$$E(z_j \mid H_1) = E(x_j + i y_j \mid H_1) = d_j \exp i\psi_j;$$

that is,

$$E(x_j \mid H_1) = d_j \cos\psi_j, \qquad E(y_j \mid H_1) = d_j \sin\psi_j.$$

Their joint probability density function has the circular Gaussian form

$$p_1(x_j, y_j) = p_1(z_j) = \frac{1}{2\pi} \exp[-\tfrac{1}{2} |z_j - d_j \exp i\psi_j|^2] = \frac{1}{2\pi} \exp\{-\tfrac{1}{2}[(x_j - d_j \cos\psi_j)^2 + (y_j - d_j \sin\psi_j)^2]\}. \tag{4-23}$$



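The moment-generating function of $\tfrac{1}{2} x_j^2$ is therefore, as in (4-22),

$$E[\exp(-\tfrac{1}{2} z x_j^2)] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp[-\tfrac{1}{2}(x_j - d_j \cos\psi_j)^2 - \tfrac{1}{2} z x_j^2]\, dx_j = (1 + z)^{-1/2} \exp\left[-\frac{z d_j^2 \cos^2\psi_j}{2(1 + z)}\right],$$

and the moment-generating function of $\tfrac{1}{2} y_j^2$ has the same form, but with $\cos^2\psi_j$
replaced by $\sin^2\psi_j$. Putting these into (4-22), we find for the moment-generating
function of the statistic U

$$h_1(z) = (1 + z)^{-M} \exp\left[-\frac{D^2 z}{2(1 + z)}\right], \tag{4-24}$$

where $D^2$ is the total signal-to-noise ratio as in (4-4).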
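As a quick consistency check on the moment-generating function, offered as a sketch under the assumptions $h_1(z) = (1+z)^{-M} \exp[-Sz/(1+z)]$ and $S = \tfrac{1}{2} D^2$, the mean of U obtained by differentiating $\ln h_1(z)$ at z = 0 agrees with the mean computed directly from the unit-variance components $x_j$, $y_j$:

```latex
\begin{aligned}
E(U \mid H_1) &= -\frac{d}{dz} \ln h_1(z) \Big|_{z=0}
  = \left[\frac{M}{1+z} + \frac{S}{(1+z)^2}\right]_{z=0} = M + S, \\
E(U \mid H_1) &= \tfrac{1}{2} \sum_{j=1}^{M} E(x_j^2 + y_j^2 \mid H_1)
  = \tfrac{1}{2} \sum_{j=1}^{M} (2 + d_j^2) = M + \tfrac{1}{2} D^2 .
\end{aligned}
```

Setting S = 0 gives the mean M of the statistic under $H_0$.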

The probability density function of U is the inverse Laplace transform of (4-24)
and can be found from a table to have the form

$$p_1(U) = \left(\frac{U}{S}\right)^{(M-1)/2} e^{-U-S}\, I_{M-1}(2\sqrt{SU})\, U(U), \qquad S = \tfrac{1}{2} D^2, \tag{4-25}$$

[Erd54, vol. 1, p. 197, eq. (18)], where $I_{M-1}(x)$ is the modified Bessel function of
order M - 1. This density function depends only on the total energy-to-noise ratio
$S = D^2/2$ and not on how the energy in the received signals is divided among them.
The probability $Q_d$ of detecting a set of signals of the form (4-6) with independently
random phases $\psi_m$ and effective signal-to-noise ratios $d_m^2$ is then

$$Q_d = \int_{U_0}^{\infty} p_1(U)\, dU = Q_M(D, \sqrt{2U_0}) \tag{4-26}$$

in terms of the Mth-order Q function

$$Q_M(\alpha, \beta) = \int_\beta^\infty x \left(\frac{x}{\alpha}\right)^{M-1} \exp[-\tfrac{1}{2}(x^2 + \alpha^2)]\, I_{M-1}(\alpha x)\, dx, \tag{4-27}$$

with $D^2$ the total signal-to-noise ratio defined in (4-4). The Mth-order Q function
generalizes Marcum's Q function defined in (3-76) and is related to it through

$$Q_M(\alpha, \beta) = Q(\alpha, \beta) + e^{-(\alpha^2 + \beta^2)/2} \sum_{r=1}^{M-1} \left(\frac{\beta}{\alpha}\right)^r I_r(\alpha\beta).$$

Recursive methods for computing the detection probability $Q_d$ are outlined in the
Appendix, Sec. C.3. These recursive calculations are laborious, however, when M is
large. Approximations and alternative methods suitable for M ≫ 1 will be developed
in Chapter 5.
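For moderate M and S the detection probability can also be summed directly. The sketch below does not reproduce the recursive methods of Sec. C.3; it uses a different, standard device. Expanding the Bessel function in (4-25) in its power series turns $p_1(U)$ into a Poisson-weighted mixture of gamma densities of orders M, M + 1, ..., so that $Q_d$ is the same mixture of gamma tail sums. The function name and tolerance are hypothetical.

```python
import math

def detection_probability(d2, u0, m, tol=1e-16):
    """Q_d = Q_M(D, sqrt(2*u0)) for total signal-to-noise ratio d2 = D**2.

    Uses Q_d = exp(-u0) * sum_k w_k * sum_{j < m+k} u0**j / j!, where
    w_k = exp(-S) * S**k / k! and S = d2 / 2.  Assumption: S is below
    about 700, so that exp(-S) does not underflow in double precision.
    """
    s = 0.5 * d2
    partial, term = 0.0, 1.0
    for j in range(m):
        partial += term
        term *= u0 / (j + 1)
    # Here partial = sum_{j<m} u0**j / j! and term = u0**m / m!.
    weight = math.exp(-s)  # Poisson weight for k = 0
    total, k = 0.0, 0
    while True:
        total += weight * partial
        if k > s and weight < tol:  # past the Poisson mode, weights negligible
            break
        partial += term
        k += 1
        term *= u0 / (m + k)
        weight *= s / k
    return math.exp(-u0) * total

# With no signal (D = 0) the result reduces to the false-alarm probability.
q0_check = detection_probability(0.0, -math.log(1e-6), 1)
```

For M = 1 the value can be compared with the identity $Q(a, a) = \tfrac{1}{2}[1 + e^{-a^2} I_0(a^2)]$ for Marcum's Q function.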

That the probability $Q_d$ in (4-26) is independent of the set of phases $\psi_k$ of
the signals indicates, according to a criterion to be developed in Sec. 7.6, that the
uniform distribution we assumed for them is least favorable. The probability of
detection is also independent of how the total signal-to-noise ratio $D^2$ is divided
among the several signals $s_k(t)$, a confirmation of the natural expectation that the
least favorable distribution of the average signal-to-noise ratios is the uniform one
$\langle d_k^2 \rangle \equiv \langle d^2 \rangle = D^2/M$ that we used in deriving the threshold statistic (4-19).

The probability density function of the statistic U under hypothesis $H_0$ can be
found by taking the inverse Laplace transform of the moment-generating function
obtained by setting D = 0 in (4-24). Alternatively we can substitute into (4-25) the
power series for the modified Bessel function

$$I_{M-1}(x) = \sum_{k=0}^{\infty} \frac{(x/2)^{M-1+2k}}{k!\, (M - 1 + k)!}$$

with $x = 2\sqrt{SU}$, after which we let S go to zero. Only the first term of the series
remains, and we find

$$p_0(U) = \frac{U^{M-1}}{(M-1)!}\, e^{-U}\, U(U), \tag{4-29}$$

which is known as the gamma distribution. Integrating (4-29), we find for the false-alarm
probability

$$Q_0 = \int_{U_0}^{\infty} p_0(U)\, dU = \sum_{k=0}^{M-1} \frac{U_0^k}{k!} \exp(-U_0), \tag{4-30}$$

which is easily programmed for a calculator. Values can also be found from tables
of the incomplete gamma function,

$$Q_0 = 1 - I(M^{-1/2} U_0, M - 1), \qquad I(u, p) = \frac{1}{p!} \int_0^{u\sqrt{p+1}} x^p e^{-x}\, dx$$

[Pea34]. For the false-alarm probability $Q_0 = 10^{-p}$, p an integer from 1 to 12, and
1 ≤ M ≤ 150, the decision level $U_0$ can be determined from tables published by
Pachares [Pac58]. It is not difficult to program the solution of (4-30) for $U_0$ by
Newton's method:

$$U_0 \leftarrow U_0 + \frac{q(U_0) - Q_0}{p_0(U_0)},$$

where $q(U_0)$ is the function on the right side of (4-30). One can start with the trial
value $U_0 = -\ln Q_0 + M - 1$. An alternative method, useful when M ≫ 1, will be
presented in Sec. 5.3.3.
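The solution of (4-30) for the decision level can be sketched as follows. This is an illustrative implementation, not the author's program: the function names, tolerance, and iteration limit are hypothetical. It applies Newton's method to $q(U_0) - Q_0 = 0$, using the fact that the derivative of the right side of (4-30) with respect to $U_0$ is $-p_0(U_0)$, and starts from the trial value $-\ln Q_0 + M - 1$ suggested in the text.

```python
import math

def false_alarm_probability(u0, m):
    """Right side of (4-30): exp(-u0) * sum_{k<m} u0**k / k!."""
    term, total = 1.0, 0.0
    for k in range(m):
        total += term
        term *= u0 / (k + 1)
    return math.exp(-u0) * total

def gamma_density(u, m):
    """p_0(U) of (4-29) for u > 0, evaluated through logarithms."""
    return math.exp((m - 1) * math.log(u) - u - math.lgamma(m))

def decision_level(q0, m, tol=1e-12, max_iter=100):
    """Solve false_alarm_probability(u0, m) = q0 for u0 by Newton's method."""
    u = -math.log(q0) + m - 1  # trial value from the text
    for _ in range(max_iter):
        step = (false_alarm_probability(u, m) - q0) / gamma_density(u, m)
        u += step
        if abs(step) < tol * u:
            break
    return u

u0_single = decision_level(1e-6, 1)   # equals -ln(1e-6) when M = 1
u0_twenty = decision_level(1e-6, 20)
```

For M = 1 the trial value is already exact, since $Q_0 = \exp(-U_0)$; for larger M the iterates increase steadily toward the root.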

Figure 4-2 exhibits the detection probability $Q_d$ versus the quantity $D = \sqrt{2S}$
for a false-alarm probability $Q_0 = 10^{-6}$ and various numbers M of signals. (When
the spectral density of the noise equals N in all inputs, $S = E_T/N$, where $E_T$ is
the total received signal energy.) Additional graphs can be found in [Mar48] and in
[DiF68]. The total energy required to attain a given probability of detection increases
with the number M of observations: the greater the number M of incoherent pulses
among which its total energy is divided, the more difficult the signal is to detect.
For a given energy $E_j \equiv E$ per pulse, on the other hand, the probability of detection
of course increases with the number M of signals.

Figure 4-3 shows the loss incurred when the total signal energy $E_T$ is divided
among M incoherent signals. Let $D_M^2$ be the signal-to-noise ratio required to attain
a probability $Q_d$ of detection for a false-alarm probability $Q_0$ given by (4-30). The
loss is then defined as $10 \log_{10}(D_M^2 / D_1^2)$. For the curves in Fig. 4-3 the false-alarm
probability was set at $10^{-6}$. For M = 1000 the loss is on the order of 9 dB.

In Fig. 4-4 we compare four different situations: (a) The M signals are coherent,
and the probability $Q_d$ of detection is given by (4-9). (b) The signal has a
fixed amplitude, but appears in only one input, and the decision is made as in the
maximum-likelihood receiver described in Sec. 3.6.4, whereupon $Q_d$ is given by (3-119)
and the false-alarm probability $Q_0$ by (3-118). (c) The M signals are incoherent;
the total received signal energy is fixed, but arbitrarily divided among all M inputs;
and the threshold detector of (4-19) is used, so that $Q_d$ is given by (4-26) and $Q_0$ by
(4-30). (d) The maximum-likelihood receiver is utilized, but the energy of the signal
is divided equally among all M inputs, whereupon

$$Q_d = 1 - [1 - Q(M^{-1/2} D, r_0)]^M,$$

with the false-alarm probability again given by (3-118). These detection probabilities
have been calculated for M = 20 and $Q_0 = 10^{-6}$ and plotted in Fig. 4-4.

4.2.3 Detection Probability for Rayleigh Fading 

In Sec. 4.2.1 we found that the optimum statistic for detecting the set of quasi- 
harmonic signals s k (t) when they are subject to independent Rayleigh fading is the 
sum of squares U in (4-19). The average probability Q d of detection with respect 
to the class of signals whose strength parameters d k have the Rayleigh distribution 



Sec. 4.2 The Threshold Detector 



147 




D 

Figure 4-2. Detection probability for incoherent signals of fixed amplitude. Qo = 
10~ 6 . Curves are indexed by the number M of inputs. 



can be found by averaging the detection probability Qj in (4-26) with respect to the 
joint probability density function of those parameters given in (4-17). It is simpler, 
however, to proceed as follows. 

The probability density function $p_0(U)$ of the statistic U under hypothesis $H_0$
was shown in Sec. 4.2.2 to be given by (4-29). The likelihood ratio for U is

$$\Lambda(U) = (1 + s^2)^{-M} \exp\left(\frac{s^2 U}{1 + s^2}\right)$$

from (4-18) and (4-19). Hence as in (1-30) the probability density function of U
under hypothesis $H_1$ must be

$$p_1(U) = \Lambda(U)\,p_0(U) = \frac{U^{M-1}}{(M-1)!\,(1 + s^2)^M} \exp\left(-\frac{U}{1 + s^2}\right), \qquad U > 0.$$

Figure 4-3. Loss $d_M^2/d_1^2$ in decibels when the signal energy is divided among M
incoherent pulses; $Q_0 = 10^{-6}$. The curves are indexed with the probability $Q_d$ of
detection.



The random variable $U/(1 + s^2)$ has the gamma distribution. The average probability
of detecting these Rayleigh-fading signals is therefore

$$\bar Q_d = \sum_{k=0}^{M-1} \frac{U_0'^{\,k}}{k!}\, e^{-U_0'}, \qquad U_0' = \frac{U_0}{1 + s^2}, \tag{4-31}$$

and can be calculated by the same algorithm as the false-alarm probability $Q_0$, which
is still given by (4-30).
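
A minimal Python sketch of this calculation follows. It assumes that (4-30) is the truncated Poisson sum $Q_0 = e^{-U_0}\sum_{k=0}^{M-1} U_0^k/k!$, so that the same routine serves for both $Q_0$ and $\bar Q_d$; the bisection used to find the decision level $U_0$ is a convenience chosen here, not a method taken from the text, and the function names are hypothetical.

```python
import math

def gamma_tail(m, u):
    """e^{-u} * sum_{k=0}^{m-1} u^k / k!, the tail probability of a gamma variable of order m."""
    term = math.exp(-u)
    total = 0.0
    for k in range(m):
        total += term
        term *= u / (k + 1)
    return total

def threshold(q0, m):
    """Decision level U0 solving gamma_tail(m, U0) = q0, found by bisection."""
    lo, hi = 0.0, 1.0
    while gamma_tail(m, hi) > q0:      # bracket the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_tail(m, mid) > q0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def qd_rayleigh(s2, q0, m):
    """Average detection probability (4-31) for M independently Rayleigh-fading signals."""
    return gamma_tail(m, threshold(q0, m) / (1.0 + s2))
```

For a single input, `qd_rayleigh(13807.6, 1e-6, 1)` comes out very close to 0.999.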

This average probability of detection is plotted in Fig. 4-5 versus the parameter
$(2Ms^2)^{1/2} = \langle D^2\rangle^{1/2}$ for a false-alarm probability $Q_0 = 10^{-6}$ and various numbers
M of signals; $\langle D^2\rangle$ is the average total signal-to-noise ratio. When the number
M of fading signals is small, the average total energy required to attain a detection 
probability on the order of 0.9 or more is much greater than the total energy required 
for signals of fixed amplitude. The larger M, however, the more closely do the curves 
in Fig. 4-5 approach those in Fig. 4-2, and the less deleterious is the effect of the 
fading on the probability of detection. 

If all the signal energy is concentrated in a single Rayleigh-fading signal, the
average received energy-to-noise ratio $s^2$ necessary in order to attain a reasonable
average probability $\bar Q_d$ of detection is very large, as can be seen from Fig. 4-5. For
instance, for $Q_0 = 10^{-6}$ and $\bar Q_d = 0.999$, $\bar S = s^2 = 13{,}807.6$. The more independently
fading signals among which the total energy is divided, up to a point,
the lower is the required average total received energy-to-noise ratio $\bar S = Ms^2$. In
Fig. 4-6 we have plotted this energy-to-noise ratio $\bar S = \langle E_T\rangle/N$ versus the number
M of inputs for $Q_0 = 10^{-6}$ and three values of the average detection probability $\bar Q_d$.
The energy-to-noise ratio is smallest when the energy is shared among from fifteen
to thirty signals.

Figure 4-4. Detection probabilities for cases (a) through (d) described in the text;
M = 20, $Q_0 = 10^{-6}$.

Figure 4-5. Detection probability for Rayleigh-fading incoherent signals versus
$\langle D^2\rangle^{1/2}$; $\langle D^2\rangle$ is the average signal-to-noise ratio. $Q_0 = 10^{-6}$. Curves are
indexed by the number M of inputs.

Suppose that as in Sec. 3.6.4 it is known that the signal will appear in only
one of the M inputs, but in which one is unknown, and suppose that once again the
decision that it is present is made whenever any of the M quantities $r_j$ defined in
(4-11) exceeds a decision level $r_0$. This we called the maximum-likelihood receiver.
If that signal is subject to Rayleigh fading, the false-alarm probability is again given
by (3-118), but the average probability $\bar Q_d$ of detecting the signal is now

$$\bar Q_d = 1 - \left[1 - \exp\left(-\frac{U_0}{1 + \bar S}\right)\right] [1 - \exp(-U_0)]^{M-1}, \qquad U_0 = \tfrac{1}{2} r_0^2, \tag{4-32}$$

where $\bar S = \tfrac{1}{2}\langle D^2\rangle$ is the average energy-to-noise ratio in that input containing the
signal.
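
Equation (4-32) is simple enough to evaluate directly. The sketch below does so in Python, again assuming that (3-118) reads $Q_0 = 1 - (1 - e^{-U_0})^M$ with $U_0 = \tfrac{1}{2} r_0^2$; the function name `qd_ml_rayleigh` is hypothetical.

```python
import math

def qd_ml_rayleigh(s_bar, q0, m):
    """Average detection probability (4-32) for the maximum-likelihood receiver
    when the single signal present suffers Rayleigh fading.

    s_bar : average energy-to-noise ratio in the input containing the signal
    q0    : false-alarm probability, assumed equal to 1 - (1 - exp(-U0))^M
    m     : number of inputs
    """
    p = -math.expm1(math.log1p(-q0) / m)   # exp(-U0), chance that one noise-only input exceeds r0
    u0 = -math.log(p)
    miss_signal_input = 1.0 - math.exp(-u0 / (1.0 + s_bar))
    miss_noise_inputs = (1.0 - p) ** (m - 1)
    return 1.0 - miss_signal_input * miss_noise_inputs
```

Setting `s_bar = 0` reproduces the false-alarm probability, and with a single input the result reduces to $\exp[-U_0/(1 + \bar S)]$.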

4.2.4 Other Types of Fading Signals 

The decision about the presence or absence of a train of signal pulses is often based
on the sum U of the outputs of a quadratic detector (4-19), whatever the distribution
of the fading signal amplitudes, although U is in general not the optimum statistic.
The average probability $\bar Q_d$ of detection can then be obtained by averaging (4-26)
with respect to the resultant probability distribution of the quantity $D^2$ defined in
(4-4). This is $D^2 = 2E_T/N$ when the noise spectral densities $N_j$ in all the inputs are
assumed equal to N, with $E_T$ the total received signal energy. The decision level $U_0$
with which U is compared is still determined as in (4-30).

The probability density function $p_1(U)$ of the statistic U under hypothesis $H_1$
is not usually simple to calculate for an arbitrary type of fading. The moment-generating
function

$$h(z) = E(e^{-zU} \mid H_1)$$

can easily be written down, however, in terms of that of the total energy-to-noise
ratio $S = \tfrac{1}{2}D^2 = E_T/N$. In (4-24) we replace $\tfrac{1}{2}D^2$ by S and average with respect
to the prior probability density function z(S) of S to obtain

$$h(z) = \frac{1}{(1 + z)^M} \int_0^\infty z(S) \exp\left(-\frac{zS}{1 + z}\right) dS = \frac{1}{(1 + z)^M}\, h_S\!\left(\frac{z}{1 + z}\right), \tag{4-33}$$

$$h_S(z) = E(e^{-zS}) = \int_0^\infty z(S)\, e^{-zS}\, dS,$$

the moment-generating function of the total energy-to-noise ratio S.

Figure 4-6. Energy-to-noise ratio $10 \log_{10} \bar S$ when the energy of the Rayleigh-fading
signals is divided among M independent signals; $Q_0 = 10^{-6}$. The curves
are indexed with the average detection probability $\bar Q_d$.

Swerling [Swe60] described four common types of signal fading, which are
known in the radar literature as the Swerling cases. We summarize them as follows.

Case 1. The M signal amplitudes fade together, their common amplitude
having a Rayleigh distribution. A distribution of this kind arises when a radar
pulse impinges on a complicated target and is reflected from many points thereon.
The great number of reflected waves combine with random amplitudes and phases,
and the complex amplitude of their sum has real and imaginary parts that are approximately
Gaussian random variables by virtue of the central limit theorem. The
amplitude of the resultant echo then has a Rayleigh distribution as in (4-17), and
the total energy-to-noise ratio S has the exponential distribution

$$z(S) = \frac{1}{\bar S}\, e^{-S/\bar S}\, U(S), \tag{4-34}$$

where $\bar S = Ms^2$ is the average total energy-to-noise ratio.



Case 2. The M signal amplitudes fade independently and again have a
Rayleigh distribution; this is the phenomenon analyzed in Sec. 4.2.3. In contrast
to case 1, the radar target changes its position and its orientation significantly and
erratically during the intervals between irradiation by the transmitted pulses, and
the complex amplitudes of successive echoes can be assumed to be statistically independent.
Now the total energy-to-noise ratio S has a scaled gamma distribution:

$$z(S) = \frac{1}{(M - 1)!\, S} \left(\frac{MS}{\bar S}\right)^{M} e^{-MS/\bar S}\, U(S). \tag{4-35}$$

Case 3. The M signal amplitudes fade together, and the density function of
the common individual signal-to-noise ratio $d_j \equiv d$ has the form

$$z(d) = \frac{2d^3}{s^4}\, e^{-d^2/s^2}\, U(d). \tag{4-36}$$

This density function has a narrower peak than the Rayleigh distribution in case 1,
as will happen when a few points on the radar target scatter the incident radiation
much more strongly than the others. The probability density function of the total
energy-to-noise ratio is then

$$z(S) = \frac{4S}{\bar S^2}\, e^{-2S/\bar S}\, U(S). \tag{4-37}$$

Case 4. The M signal amplitudes fade independently, the individual signal-to-noise
ratios having the same density function as in (4-36). The density function
of the total energy-to-noise ratio is then

$$z(S) = \frac{1}{(2M - 1)!\, S} \left(\frac{2MS}{\bar S}\right)^{2M} e^{-2MS/\bar S}\, U(S). \tag{4-38}$$

In all four of these Swerling cases the total energy-to-noise ratio S has a gamma
distribution with density function

$$z(S) = \frac{1}{(k - 1)!\, s'} \left(\frac{S}{s'}\right)^{k-1} e^{-S/s'}\, U(S), \qquad s' = \frac{\bar S}{k}, \tag{4-39}$$

for which the moment-generating function is

$$h_S(z) = (1 + s'z)^{-k}. \tag{4-40}$$

For the four Swerling cases, k = 1, M, 2, and 2M, respectively. By (4-33) the quadratic
statistic U of (4-19) has the moment-generating function

$$h(z) = (1 + z)^{k-M} (1 + bz)^{-k}, \qquad b = 1 + \frac{\bar S}{k}. \tag{4-41}$$
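
The step from (4-33) and (4-40) to (4-41) is a one-line substitution. Written out, and taking the scale parameter of the gamma density to be $s' = \bar S/k$ so that the mean of S is $\bar S$ (an assumption made here for the sake of the sketch), it runs as follows.

```latex
\begin{aligned}
h(z) &= (1+z)^{-M}\, h_S\!\left(\frac{z}{1+z}\right)
      = (1+z)^{-M} \left(1 + \frac{s'z}{1+z}\right)^{-k} \\
     &= (1+z)^{-M} \left(\frac{1 + (1+s')z}{1+z}\right)^{-k}
      = (1+z)^{k-M}\, (1+bz)^{-k}, \qquad b = 1 + s' = 1 + \frac{\bar S}{k}.
\end{aligned}
```

For case 2, k = M and the factor $(1+z)^{k-M}$ disappears, leaving the moment-generating function of a gamma variable of order M scaled by $1 + s^2$, in agreement with Sec. 4.2.3.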

The probability density functions of the threshold statistic U and the resulting detection
probabilities $\bar Q_d = \Pr(U > U_0 \mid H_1)$ for the four types of fading represented
by the Swerling cases are to be found in the book by DiFranco and Rubin [DiF68],
which also presents graphs of $\bar Q_d$ versus the average input energy-to-noise ratio for
a number of values of M and $Q_0$. Appendix E modifies the recurrence methods
of Appendix C for such fading signals. Closed-form expressions for these detection
probabilities have been given by Hou et al. [Hou87]. When $M \gg 1$, however,
computing $\bar Q_d$ by those methods is a lengthy process. Broadly applicable alternative
methods will be described in Chapter 5.

Figure 4-7. Average probability $\bar Q_d$ of detection versus average energy-to-noise
ratio $10 \log_{10}(\bar S/M)$ for Swerling cases 1 and 3; M = 20, $Q_0 = 10^{-6}$.

In Figs. 4-7 and 4-8 we show the average probability of detection as a function
of $\bar S/M$ in decibels for "integration" of M = 20 pulses suffering fading according
to Swerling's four distributions. The curve on Fig. 4-8 marked $\infty$ refers to unfading
signals. When as in cases 1 and 3 the entire pulse train fades together, a
much greater average energy-to-noise ratio is required than when the signals fade
independently.
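
One way to reproduce such comparisons numerically is sketched below in Python. It is not the recurrence of Appendix E nor the closed forms of [Hou87]; it is an alternative route that assumes U given S is a Poisson mixture of gamma variables of order M + n, and that averaging the Poisson mean S over a gamma density of order k and scale $s' = \bar S/k$ turns the mixing weights into negative-binomial probabilities. The decision level is taken from the assumed form $Q_0 = e^{-U_0}\sum_{k<M} U_0^k/k!$ of (4-30), and all function names are hypothetical.

```python
import math

def gamma_tail(m, u):
    """e^{-u} * sum_{k=0}^{m-1} u^k / k!"""
    term = math.exp(-u)
    total = 0.0
    for k in range(m):
        total += term
        term *= u / (k + 1)
    return total

def threshold(q0, m):
    """Decision level U0 with gamma_tail(m, U0) = q0, by bisection."""
    lo, hi = 0.0, 1.0
    while gamma_tail(m, hi) > q0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_tail(m, mid) > q0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def swerling_qd(case, m, s_bar, q0, tol=1e-10, max_terms=1000000):
    """Average detection probability for Swerling case 1-4 via a negative-binomial mixture."""
    k = {1: 1, 2: m, 3: 2, 4: 2 * m}[case]
    u0 = threshold(q0, m)
    s_prime = s_bar / k
    p = s_prime / (1.0 + s_prime)
    weight = (1.0 - p) ** k            # negative-binomial probability of n = 0
    term = math.exp(-u0)
    tail = 0.0
    for j in range(m):                 # tail = Pr(gamma of order m exceeds u0)
        tail += term
        term *= u0 / (j + 1)
    qd, covered, n = 0.0, 0.0, 0
    while n < max_terms:
        qd += weight * tail
        covered += weight
        if 1.0 - covered < tol:        # remaining weights cannot add more than tol
            break
        tail += term                   # tail for order m + n + 1
        term *= u0 / (m + n + 1)
        weight *= p * (n + k) / (n + 1)
        n += 1
    return qd
```

For case 2 the result should agree with (4-31), which gives a check on the mixture; for M = 20 and $\bar S/M = 10$ the values for cases 1 and 3 fall well below those for cases 2 and 4, in line with the remark above.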

When in radar studies such as [Mar48] and [DiF68] the performances of receivers
integrating different numbers M of signals are compared, it is often not the
false-alarm probability $Q_0$ that is held fixed, but the false-alarm number $\mathcal{N}_{fa}$, defined
as

$$\mathcal{N}_{fa} = \frac{M \ln 2}{Q_0}. \tag{4-42}$$

This parameter arises through the following argument.

Figure 4-8. Average probability $\bar Q_d$ of detection versus average energy-to-noise
ratio $10 \log_{10}(\bar S/M)$ in dB/pulse for Swerling cases 2 and 4; M = 20, $Q_0 = 10^{-6}$. The curve
marked $\infty$ represents $Q_d$ for unfading signals.

Consider detection of a target at a particular range by a radar that sends out a
pulse every $\tau_r$ seconds; $1/\tau_r$ is called the pulse-repetition rate. In a total observation
time $T_{obs} \gg \tau_r$, the number of decisions made equals

$$n = \frac{T_{obs}}{M\tau_r},$$

for when M pulses are integrated, a decision is made only every $M\tau_r$ seconds. The
probability that no false alarm occurs in that time $T_{obs}$ will be $(1 - Q_0)^n$. Now define
the false-alarm time $T_{fa}$ as the observation time $T_{obs}$ within which this probability
equals $\tfrac{1}{2}$:

$$(1 - Q_0)^{n'} = \tfrac{1}{2}, \qquad n' = \frac{T_{fa}}{M\tau_r}. \tag{4-43}$$

Thus in a time $T_{fa}$ the probability of having at least one false alarm equals $\tfrac{1}{2}$. With
$Q_0 \ll 1$, this equation becomes

$$-n' \ln(1 - Q_0) \approx n' Q_0 = \ln 2,$$

whence

$$\mathcal{N}_{fa} = \frac{T_{fa}}{\tau_r} = \frac{M \ln 2}{Q_0}.$$

The false-alarm number can therefore be interpreted as the number of pulses transmitted
in an interval of such a duration $T_{fa}$ that the probability of at least one false
alarm equals $\tfrac{1}{2}$.
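
The relation between the false-alarm number and the false-alarm probability is easy to check numerically. The Python sketch below compares the exact solution of $(1 - Q_0)^{n'} = \tfrac{1}{2}$, with $n' = \mathcal{N}_{fa}/M$, against the small-$Q_0$ approximation of (4-42); the function names are illustrative.

```python
import math

def q0_exact(n_fa, m):
    """False-alarm probability solving (1 - Q0)^{n'} = 1/2 with n' = N_fa / M."""
    n_prime = n_fa / m
    return -math.expm1(-math.log(2.0) / n_prime)

def q0_approx(n_fa, m):
    """Small-Q0 approximation Q0 = M ln 2 / N_fa from (4-42)."""
    return m * math.log(2.0) / n_fa

if __name__ == "__main__":
    for m in (1, 20, 100):
        print(m, q0_exact(1e6, m), q0_approx(1e6, m))
```

For $\mathcal{N}_{fa} = 10^6$ and M = 20 the two values differ by less than one part in $10^5$, so the approximation is ample for the cases plotted here.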

An actual radar searches for targets not at a single range, but at all possible
ranges from which echoes might return during the interpulse interval $\tau_r$. It is often
considered, for reasons we shall encounter later, as in effect dividing the interval
$(0, \tau_r)$ into a number $L = W\tau_r$ of range bins, where W is the bandwidth of the
radar echo. The numbers n and n' of decisions must then be multiplied by this
factor L, so that n' in (4-43) is replaced by $n' = WT_{fa}/M$. The false-alarm number
then becomes $\mathcal{N}_{fa} = WT_{fa}$ and can be considered as the number of range bins of
duration 1/W contained within the false-alarm time $T_{fa}$.

In Fig. 4-9 we compare the average total signal energy needed to attain a fixed
false-alarm number $\mathcal{N}_{fa}$ and a fixed probability $\bar Q_d$ of detection for signals fluctuating
as specified by the four Swerling cases. The average total energy-to-noise ratio $\bar S$
in decibels is plotted versus the number M of incoherent pulses among which the
energy is divided. The curve marked $\infty$ refers to signals with a fixed nonrandom
total energy; for these, $Q_d$ is calculated from (4-26). In cases 2 and 4, the signal
amplitudes fluctuate independently, and as M increases, the required average total
energy-to-noise ratio $\bar S$ approaches that needed for nonfluctuating signals. In cases
1 and 3, the amplitudes of all the signals in the pulse train vary together, and to
ensure a given average probability $\bar Q_d$ of detection, the average total energy-to-noise
ratio $\bar S$ must be large for all values of M.

The handbook Radar Target Detection [Mey73] contains extensive graphs of
the average probability $\bar Q_d$ of detection as a function of the number M of pulses
integrated. Incoherent detection of a nonfluctuating target and detection of targets
fluctuating according to all four Swerling cases are included. The receiver is of
the kind treated in this chapter; it sums the quadratically rectified outputs of the
matched filter, as in (4-19). Each figure in that book refers to a particular value
of the parameter $(\ln 2)/Q_0 = \mathcal{N}_{fa}/M$. The curves are indexed with the value of the
energy-to-noise ratio per pulse $\bar S/M$ in decibels. The application to radar of the
theory we have presented in this section is discussed at length, and an appendix lists 
the formulas used for computing the detection probabilities. 



4.3 BEAMFORMING 

4.3.1 Detection by an Array of Transducers

A transducer is a small receptor that responds to an incident electromagnetic or 
acoustic wave field by generating a current or voltage directly proportional to it. 
An array of such transducers whose outputs are appropriately weighted and added 
forms a kind of antenna, which might be used for picking up radar or sonar echoes 
from a distant target. The transducers, although placed close together, are assumed
not to interact or influence each other in any way that cannot be compensated for
by the proper circuitry.

Figure 4-9. Average total energy-to-noise ratio in decibels required to attain false-alarm
number $\mathcal{N}_{fa} = 10^6$ and probability $\bar Q_d = 0.9999$ of detection versus the
number M of inputs. Curves are indexed by the number of the Swerling case; $\infty$
refers to nonfluctuating signal amplitudes.

Let the jth of M transducers in an array be located at a point $\boldsymbol{\xi}_j = (x_j, y_j)$
in a plane, of which a one-dimensional cross section is shown in Fig. 4-10. (In this
section we shall indicate two-component vectors by boldface type.) The output of
that transducer during an observation interval (0, T) will be denoted by $v_j(t)$. When
no target is present (hypothesis $H_0$),

$$v_j(t) = n_j(t), \qquad 1 \le j \le M,$$

where $n_j(t)$ is white Gaussian noise of unilateral spectral density N. It arises both
from the input load resistor of the transducer and from broadband fluctuations in the
surrounding medium. The resistor noise will be independent from one transducer
to another, and the external noise is assumed to come from so broad a range of
directions that it creates uncorrelated and hence statistically independent outputs
from the several transducers.

Figure 4-10. Transducer array and signal source.

Pulses of the quasiharmonic form $\mathrm{Re}\, F(t) \exp i\Omega t$ and carrier frequency $\Omega$ are
transmitted toward the anticipated target. The receiver is to process the outputs $v_j(t)$
in such a way as most effectively to decide whether a target is indeed present at a
point $(u_x, u_y, R) = (\mathbf{u}, R)$ in a plane parallel to the transducer array and a long distance
R away. The signal induced in the jth transducer by the echo from the target is

$$s_j(t) = A\, \mathrm{Re}\, F(t - r_j/c) \exp[i\Omega(t - r_j/c) + i\psi'],$$

where $r_j$ is the distance from the target to the jth transducer and c the velocity of
the signal, electromagnetic or acoustic, as the case may be. We assume that the
complex envelope $F(t - r_j/c)$ varies so slowly that it is approximately the same in
all the transducers; differences in the times at which the echo arrives at the various
elements of the array are negligible in comparison with the duration of $F(t - r_j/c)$:

$$F(t - r_j/c) \approx F(t - R'/c),$$

where R' is the distance from the target to the center of the array.
The distance $r_j$ is given by

$$\begin{aligned}
r_j &= [R^2 + (x_j - u_x)^2 + (y_j - u_y)^2]^{1/2} = [R^2 + |\boldsymbol{\xi}_j - \mathbf{u}|^2]^{1/2} \\
    &= [R^2 + |\mathbf{u}|^2 - 2\boldsymbol{\xi}_j \cdot \mathbf{u} + |\boldsymbol{\xi}_j|^2]^{1/2} = [R'^2 - 2\boldsymbol{\xi}_j \cdot \mathbf{u} + |\boldsymbol{\xi}_j|^2]^{1/2},
\end{aligned}$$

where

$$R'^2 = R^2 + |\mathbf{u}|^2.$$

We expand the bracket in a power series and assume that the source is so remote
that we can neglect terms involving $|\boldsymbol{\xi}_j|/R'$ to powers greater than the first, and we
obtain

$$r_j \approx R' - \boldsymbol{\xi}_j \cdot \boldsymbol{\theta}, \tag{4-44}$$

where

$$\boldsymbol{\theta} = \mathbf{u}/R'$$

is a 2-vector specifying the direction of the target. After shifting the time variable
$t - R'/c \to t$ and absorbing all the neglected terms into the phase $\psi$, which is unknown
and assumed uniformly distributed over $(0, 2\pi)$, we write the quasiharmonic signal
from the jth transducer as

$$s_j(t) = A\, \mathrm{Re}\, F(t) \exp(i\Omega t + i\psi + ik\boldsymbol{\xi}_j \cdot \boldsymbol{\theta}), \tag{4-45}$$

where $k = \Omega/c = 2\pi/\lambda$ is the propagation constant of the radiation and $\lambda$ is its
wavelength. This is known as the paraxial approximation.
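
To get a feeling for the size of the terms neglected in the paraxial approximation, the Python sketch below compares the exact distance $r_j$ with $R' - \boldsymbol{\xi}_j \cdot \boldsymbol{\theta}$ for one transducer position. The geometry (a target 10 km away, a transducer about a meter from the array center) is an arbitrary illustration, not an example from the text, and the function names are hypothetical.

```python
import math

def exact_range(xi, u, big_r):
    """Exact distance from a target at (u, R) to a transducer at xi in the array plane."""
    return math.sqrt(big_r ** 2 + (xi[0] - u[0]) ** 2 + (xi[1] - u[1]) ** 2)

def paraxial_range(xi, u, big_r):
    """First-order approximation r_j ~ R' - xi . theta, with theta = u / R'."""
    r_prime = math.sqrt(big_r ** 2 + u[0] ** 2 + u[1] ** 2)
    theta = (u[0] / r_prime, u[1] / r_prime)
    return r_prime - (xi[0] * theta[0] + xi[1] * theta[1])

if __name__ == "__main__":
    xi = (1.0, -0.5)          # transducer position, meters (illustrative)
    u = (500.0, 300.0)        # transverse target coordinates, meters
    big_r = 1.0e4             # distance of the target plane, meters
    err = exact_range(xi, u, big_r) - paraxial_range(xi, u, big_r)
    print("range error (m):", err)
```

The leading neglected term is $[|\boldsymbol{\xi}_j|^2 - (\boldsymbol{\xi}_j \cdot \boldsymbol{\theta})^2]/2R'$, so the resulting phase error $k$ times that amount is small only when the array is small compared with $(\lambda R')^{1/2}$.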
Comparing (4-45) with (4-2), we see that

$$S_j(t) = A F(t) \exp ik\boldsymbol{\xi}_j \cdot \boldsymbol{\theta} \tag{4-46}$$

is the complex envelope of the signal from the jth transducer, and (4-8) prescribes
that the receiver form the statistic

$$r' = \left| \sum_{m=1}^{M} \exp(-ik\boldsymbol{\xi}_m \cdot \boldsymbol{\theta}) \int_0^T F^*(t) V_m(t)\, dt \right| \tag{4-47}$$

and compare it with a decision level $r_0'$, deciding that a target is present when $r' > r_0'$.
Under the Neyman-Pearson criterion this test will be uniformly most powerful with
respect to amplitude. For simplicity of analysis we adopt the proportional statistic
$r = |Z|$, where Z is a circular complex Gaussian random variable given by

$$Z = X + iY = M^{-1/2} \sum_{m=1}^{M} z_m, \tag{4-48}$$

$$z_m = C \exp(-ik\boldsymbol{\xi}_m \cdot \boldsymbol{\theta}) \int_0^T F^*(t) V_m(t)\, dt, \qquad C = N^{-1/2} \left[ \int_0^T |F(t)|^2\, dt \right]^{-1/2}. \tag{4-49}$$

The receiver combines the outputs of the transducers after introducing phase shifts
$-k\boldsymbol{\xi}_m \cdot \boldsymbol{\theta}$ to compensate for the different phase delays in the paths from the target to
the transducers. The combined output is then passed through a filter matched to the
signal $\mathrm{Re}\, F(t) \exp i\Omega t$, rectified, and sampled at the end of the observation interval
to produce the statistic r, which is compared with a decision level $r_0$ chosen so that
the false-alarm probability takes on the preassigned value. When the outputs of the
transducers are shifted in phase in that way, the antenna is called a phased array.
The phase shifts $-k\boldsymbol{\xi}_m \cdot \boldsymbol{\theta}$ are just what are required to produce maximum output
signal-to-noise ratio by linear combination of echoes coming from the direction $\boldsymbol{\theta}$.

As with (4-11), the real and imaginary parts of $z_m$ have unit variances under
both hypotheses (see (4-23)), and because the $z_m$'s from different transducers are
statistically independent, the real and imaginary parts X and Y of Z in (4-48)
are also independent Gaussian random variables with unit variance. The statistic
$r = |Z|$ therefore has a Rayleigh distribution

$$p_0(r) = r e^{-r^2/2}\, U(r)$$

under hypothesis $H_0$.

When echoes from a target are present, the statistic r will have the Rayleigh-Rice
distribution of (3-73). Let us evaluate the signal-to-noise ratio $d^2$ appearing
in that density function under the supposition that the target lies in a direction
$\boldsymbol{\theta}' = \mathbf{u}'/R'$ that may differ from the direction from which the echoes are expected to
come. In the signal $S_j(t)$ in (4-46) we now replace $\boldsymbol{\theta}$ by $\boldsymbol{\theta}'$, whereupon the expected
value of the circular complex Gaussian random variable Z is

$$E(Z \mid H_1, \boldsymbol{\theta}') = \frac{AC}{\sqrt{M}} \sum_{m=1}^{M} \exp[ik\boldsymbol{\xi}_m \cdot (\boldsymbol{\theta}' - \boldsymbol{\theta})] \int_0^T |F(t)|^2\, dt\; e^{i\psi} = D\, G(\boldsymbol{\theta}' - \boldsymbol{\theta})\, e^{i\psi},$$

where

$$D^2 = \frac{MA^2}{N} \int_0^T |F(t)|^2\, dt = \frac{2E_T}{N} \tag{4-50}$$

is the total signal-to-noise ratio for a source at $\boldsymbol{\theta}' = \boldsymbol{\theta}$; $E_T$ is the total energy received
from the target.
The function

$$G(\boldsymbol{\theta}) = \frac{1}{M} \sum_{m=1}^{M} \exp ik\boldsymbol{\xi}_m \cdot \boldsymbol{\theta} \tag{4-51}$$

is the amplitude gain pattern of the array. The false-alarm and detection probabilities
for a source at $\boldsymbol{\theta}'$ are now, as in Sec. 3.4,

$$Q_0 = e^{-b^2/2}, \qquad Q_d = Q(D_{\rm eff}, b) \tag{4-52}$$

in terms of Marcum's Q function (3-76), with

$$D_{\rm eff}^2 = D^2 |G(\boldsymbol{\theta}' - \boldsymbol{\theta})|^2$$

the effective signal-to-noise ratio.
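
The gain pattern (4-51) can be evaluated directly for any set of transducer positions. The Python sketch below does so for a uniform line array, a geometry chosen here purely for illustration, and compares the direct sum with the familiar closed form $|\sin(Mkd\theta_x/2)/[M\sin(kd\theta_x/2)]|$. The element count, the half-wavelength spacing, and the function names are all assumptions of the sketch.

```python
import cmath
import math

def gain(positions, direction, k):
    """Amplitude gain pattern G = (1/M) sum_m exp(i k xi_m . direction), as in (4-51)."""
    total = sum(cmath.exp(1j * k * (x * direction[0] + y * direction[1]))
                for x, y in positions)
    return total / len(positions)

WAVELENGTH = 1.0
K = 2.0 * math.pi / WAVELENGTH
SPACING = 0.5 * WAVELENGTH             # half-wavelength spacing (illustrative)
M = 16
POSITIONS = [(j * SPACING, 0.0) for j in range(M)]

def line_array_magnitude(theta_x):
    """Closed-form |G| for the uniform line array defined above."""
    half = 0.5 * K * SPACING * theta_x
    if abs(math.sin(half)) < 1e-15:
        return 1.0
    return abs(math.sin(M * half) / (M * math.sin(half)))

def effective_snr(d_squared, theta_target, theta_look):
    """D_eff^2 = D^2 |G(theta' - theta)|^2 for the array defined above."""
    diff = (theta_target[0] - theta_look[0], theta_target[1] - theta_look[1])
    return d_squared * abs(gain(POSITIONS, diff, K)) ** 2
```

The first null of this pattern lies at $\theta_x = \lambda/(Md)$, and a target lying there contributes nothing to the statistic however strong its echo.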

The factor $|G(\boldsymbol{\theta}' - \boldsymbol{\theta})|^2$ is often called the beam pattern of the array. It is
maximum in the direction $\boldsymbol{\theta}' = \boldsymbol{\theta}$ for which the outputs of the transducers have been
phased. When the transducers are very small and very close together, the sum in
(4-51) can be approximated by an integral,

$$G(\boldsymbol{\theta}) = \frac{1}{A} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} I(\boldsymbol{\xi})\, e^{ik\boldsymbol{\xi} \cdot \boldsymbol{\theta}}\, d^2\boldsymbol{\xi}, \qquad d^2\boldsymbol{\xi} = dx\, dy,$$

where $I(\boldsymbol{\xi})$, the indicator function of the array, equals 1 for points $\boldsymbol{\xi}$ inside the array
and 0 for points $\boldsymbol{\xi}$ outside it, and A is the area of the array. The gain pattern is the
two-dimensional spatial Fourier transform of the indicator function of the array, a
relation familiar from antenna theory.

For a circular array, for instance, of radius a,

$$G(\boldsymbol{\theta}) = \frac{1}{\pi a^2} \int_0^{2\pi} \int_0^a e^{ik\rho|\boldsymbol{\theta}|\cos\phi}\, \rho\, d\rho\, d\phi = \frac{2}{a^2} \int_0^a J_0(k\rho|\boldsymbol{\theta}|)\, \rho\, d\rho = \frac{2 J_1(ka|\boldsymbol{\theta}|)}{ka|\boldsymbol{\theta}|} \tag{4-53}$$

in terms of the Bessel function of first order. This is known as the Airy pattern.
Because the first zero of $J_1(y)$ is at y = 3.83171, the angular width of this pattern
between its first nulls on each side of the axis is $1.220(\lambda/a)$ radians or $69.88(\lambda/a)°$.
The maxima of the first sidelobes of this pattern lie about 17.6 dB below that of the
main lobe. The broader an array, in general, the narrower its effective beam pattern
for reception of signals from a distant source.

The process of picking the right phases and other weighting factors, if required,
for the outputs of an array of transducers in order to maximize the output signal-to-noise
ratio and, possibly, to meet other specifications is called beamforming. A
review, with references to the earlier literature, is to be found in [Cox73].
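
The figures quoted above for the Airy pattern (the first zero of $J_1$, the null-to-null width, and the level of the first sidelobe) can be checked with a few lines of Python. The sketch below sums the power series of $J_1$ rather than calling a library routine, so that it depends only on the standard library; the series, the bisection, and the grid search are choices made for this illustration.

```python
import math

def bessel_j1(x, terms=60):
    """J_1(x) from its power series; adequate for |x| up to about 20."""
    total = 0.0
    term = 0.5 * x                     # m = 0 term: (x/2) / (0! 1!)
    q = -(0.5 * x) ** 2
    for m in range(terms):
        total += term
        term *= q / ((m + 1) * (m + 2))
    return total

def airy_gain(y):
    """Amplitude gain 2 J_1(y) / y of a circular array, y = k a |theta|."""
    return 1.0 if y == 0.0 else 2.0 * bessel_j1(y) / y

def first_null():
    """First positive zero of J_1, by bisection on (3, 4.5)."""
    lo, hi = 3.0, 4.5
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if bessel_j1(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def first_sidelobe_db(step=1e-3):
    """Peak of |2 J_1(y)/y|^2 between the first and second nulls, in decibels."""
    y = first_null()
    peak = 0.0
    while y < 7.0:                     # the second zero of J_1 lies just above 7
        peak = max(peak, airy_gain(y) ** 2)
        y += step
    return 10.0 * math.log10(peak)

if __name__ == "__main__":
    y0 = first_null()
    print("first null y          :", y0)
    print("width / (lambda/a) rad:", y0 / math.pi)
    print("width / (lambda/a) deg:", math.degrees(y0 / math.pi))
    print("first sidelobe (dB)   :", first_sidelobe_db())
```

Since $y = 2\pi a|\boldsymbol{\theta}|/\lambda$, the null-to-null width is $2y_0\lambda/(2\pi a) = (y_0/\pi)(\lambda/a)$, which is where the coefficient 1.220 comes from.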

4.3.2 Elimination of Noise from a Point Source 

In Sec. 4.3.1 we assumed that the external noise comes in from so broad a cone of
directions that the responses $n_j(t)$ to it in the several transducers are uncorrelated
and hence statistically independent. In sonar surveillance the incoming echoes may
also be corrupted by acoustic noise from a localized source such as a passing ship.
Let us see how and to what extent that can be overcome.

For simplicity we assume that the noise source is small enough to be considered
a point; roughly speaking, this entails that the cone of directions from which the
noise comes be much narrower than the gain pattern in (4-51). Let the direction
of this noise source be given by the 2-vector $\mathbf{w}$. It is far enough away that the
approximation (4-44) can be applied to determining the differences in the phases of
noise waves reaching the several transducers of the array. Like the other ambient
noise, this additional, localized noise has a spectral width, we assume, much broader
than that of the echo signals, so that it can be treated as white.

The complex envelope of the noise component of the output $v_j(t)$ from the jth
transducer has the form

$$N_j(t) = N_{j0}(t) + N_1(t) \exp ik\boldsymbol{\xi}_j \cdot \mathbf{w},$$

where $N_{j0}(t)$ is the complex envelope of white noise of spectral density N, independent
from one transducer to another,

$$\tfrac{1}{2} E[N_{j0}(t_1) N_{j0}^*(t_2)] = N\delta(t_1 - t_2),$$

and $N_1(t)$ is the complex envelope of the noise from the point source, whose spectral
density is $N_s$; $N_{j0}(t)$ and $N_1(t)$ are of course statistically independent. Then the
complex covariance function of the noise from the jth and the mth transducers is

$$\tfrac{1}{2} E[N_j(t_1) N_m^*(t_2)] = [N\delta_{jm} + N_s \exp ik(\boldsymbol{\xi}_j - \boldsymbol{\xi}_m) \cdot \mathbf{w}]\, \delta(t_1 - t_2). \tag{4-54}$$

The signals $s_j(t)$ arising from the echoes our receiver is trying to detect will
again have complex envelopes of the form in (4-46). Because the noise is white, the
optimum receiver will as before pass the output $v_j(t)$ of each transducer through a
filter matched to the signal $\mathrm{Re}\, F(t) \exp i\Omega t$, where F(t) is the form of the complex
envelope of the arriving echo. We can assume, therefore, that its decisions will be
based on the M circular complex Gaussian random variables

$$z_j = C \int_0^T F^*(t) V_j(t)\, dt, \tag{4-55}$$

where C is as given in (4-49).

How these data $z_j$ shall be combined is determined by their likelihood ratio
$p_1(\{z_j\})/p_0(\{z_j\})$, where $p_i(\{z_j\})$ is the joint probability density function under hypothesis
$H_i$ of the real and imaginary parts of the M circular complex random variables $z_j$.
In order to write down these circular complex Gaussian density functions,
which have the forms (3-40) and (3-41), respectively, we need the complex covariance
matrix $\boldsymbol{\phi} = \boldsymbol{\mu}^{-1}$ and the expected values $S_j = E(z_j \mid H_1)$ of the random variables $z_j$.
When the signals in (4-46) are present, by (4-55),

$$S_j = AC e^{i\psi} \exp(ik\boldsymbol{\xi}_j \cdot \boldsymbol{\theta}) \int_0^T |F(t)|^2\, dt = M^{-1/2} D e^{i\psi} \exp ik\boldsymbol{\xi}_j \cdot \boldsymbol{\theta}; \tag{4-56}$$

$D^2$, defined as in (4-50), is the effective signal-to-noise ratio when no point source
of noise is operating and $N_s = 0$.

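The elements of the complex covariance matrix $\boldsymbol{\phi}$ of the $z_j$'s are, by (4-54) and
(4-49),

$$\begin{aligned}
\phi_{jm} = \tfrac{1}{2} E(z_j z_m^* \mid H_0) &= \tfrac{1}{2} C^2 E \int_0^T \int_0^T F^*(t_1) F(t_2) N_j(t_1) N_m^*(t_2)\, dt_1\, dt_2 \\
&= C^2 \int_0^T |F(t)|^2\, dt\, [N\delta_{jm} + N_s \exp ik(\boldsymbol{\xi}_j - \boldsymbol{\xi}_m) \cdot \mathbf{w}] \\
&= \delta_{jm} + h \exp ik(\boldsymbol{\xi}_j - \boldsymbol{\xi}_m) \cdot \mathbf{w}, \qquad h = \frac{N_s}{N};
\end{aligned} \tag{4-57}$$

h is the ratio of the spectral density of the noise from the point source to that of the
original ambient noise.

The elements of the inverse covariance matrix $\boldsymbol{\mu} = \boldsymbol{\phi}^{-1}$ are

$$\mu_{mn} = \delta_{mn} - \eta \exp ik(\boldsymbol{\xi}_m - \boldsymbol{\xi}_n) \cdot \mathbf{w}, \qquad \eta = \frac{h}{1 + Mh}. \tag{4-58}$$

To prove this, we substitute (4-57) and (4-58) into the usual definition of an inverse
matrix:

$$\begin{aligned}
\sum_{m=1}^{M} \phi_{jm} \mu_{mn} &= \sum_{m=1}^{M} [\delta_{jm} + h \exp ik(\boldsymbol{\xi}_j - \boldsymbol{\xi}_m) \cdot \mathbf{w}] [\delta_{mn} - \eta \exp ik(\boldsymbol{\xi}_m - \boldsymbol{\xi}_n) \cdot \mathbf{w}] \\
&= \delta_{jn} + h \exp ik(\boldsymbol{\xi}_j - \boldsymbol{\xi}_n) \cdot \mathbf{w} - \eta \exp ik(\boldsymbol{\xi}_j - \boldsymbol{\xi}_n) \cdot \mathbf{w} - Mh\eta \exp ik(\boldsymbol{\xi}_j - \boldsymbol{\xi}_n) \cdot \mathbf{w} \\
&= \delta_{jn}
\end{aligned}$$

because $h - \eta - Mh\eta = 0$.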
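
The inverse can also be confirmed numerically. The Python sketch below builds the covariance matrix with elements $\delta_{jm} + h \exp ik(\boldsymbol{\xi}_j - \boldsymbol{\xi}_m) \cdot \mathbf{w}$ and the proposed inverse with elements $\delta_{mn} - \eta \exp ik(\boldsymbol{\xi}_m - \boldsymbol{\xi}_n) \cdot \mathbf{w}$, $\eta = h/(1 + Mh)$, for an arbitrary set of transducer positions and measures how far their product departs from the identity. The positions, the direction of the noise source, and the value of h are illustrative.

```python
import cmath
import math

def steering_phase(xi, direction, k):
    """Phase k xi . direction of a plane wave at position xi."""
    return k * (xi[0] * direction[0] + xi[1] * direction[1])

def covariance(positions, w, k, h):
    """phi_jm = delta_jm + h exp[i k (xi_j - xi_m) . w]."""
    ph = [steering_phase(xi, w, k) for xi in positions]
    size = len(positions)
    return [[(1.0 if j == m else 0.0) + h * cmath.exp(1j * (ph[j] - ph[m]))
             for m in range(size)] for j in range(size)]

def inverse_covariance(positions, w, k, h):
    """mu_mn = delta_mn - eta exp[i k (xi_m - xi_n) . w], eta = h / (1 + M h)."""
    ph = [steering_phase(xi, w, k) for xi in positions]
    size = len(positions)
    eta = h / (1.0 + size * h)
    return [[(1.0 if m == n else 0.0) - eta * cmath.exp(1j * (ph[m] - ph[n]))
             for n in range(size)] for m in range(size)]

def max_identity_error(phi, mu):
    """Largest |(phi mu)_jn - delta_jn| over all matrix elements."""
    size = len(phi)
    worst = 0.0
    for j in range(size):
        for n in range(size):
            entry = sum(phi[j][m] * mu[m][n] for m in range(size))
            worst = max(worst, abs(entry - (1.0 if j == n else 0.0)))
    return worst

POSITIONS = [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0), (0.25, 0.43),
             (-0.25, 0.43), (0.25, -0.43), (-0.25, -0.43)]
K = 2.0 * math.pi                      # unit wavelength
W = (0.05, -0.02)                      # direction of the noise source
H = 3.0                                # N_s / N
```

The check succeeds for any geometry because the point-source term is a rank-one matrix, which is what makes the inverse available in closed form.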

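Dividing (3-41) by (3-40), we find the likelihood ratio

$$\frac{p_1(\{z_j\})}{p_0(\{z_j\})} = \exp\left\{ \tfrac{1}{2} \sum_{j=1}^{M} \sum_{m=1}^{M} \mu_{jm} \left( S_j^* z_m + z_j^* S_m - S_j^* S_m \right) \right\},$$

which we see depends on the data $\{z_j\}$ only through the sufficient statistic $\mathrm{Re}\, Z'$,
where, by (4-56),

$$Z' = \sum_{j=1}^{M} \sum_{m=1}^{M} \mu_{jm} S_j^* z_m = \frac{D}{\sqrt{M}}\, e^{-i\psi} \sum_{j=1}^{M} \sum_{m=1}^{M} \mu_{jm} \exp(-ik\boldsymbol{\xi}_j \cdot \boldsymbol{\theta})\, z_m = \frac{D}{\sqrt{M}}\, e^{-i\psi} \sum_{m=1}^{M} g_m z_m,$$

with the weighting factors $g_m$ defined by

$$\begin{aligned}
g_m &= \sum_{j=1}^{M} \mu_{jm} \exp(-ik\boldsymbol{\xi}_j \cdot \boldsymbol{\theta}) \\
    &= \exp(-ik\boldsymbol{\xi}_m \cdot \boldsymbol{\theta}) - \beta G(\mathbf{w} - \boldsymbol{\theta}) \exp(-ik\boldsymbol{\xi}_m \cdot \mathbf{w}), \qquad \beta = \frac{Mh}{1 + Mh};
\end{aligned} \tag{4-59}$$

$G(\cdot)$ is the amplitude gain pattern of the array defined in (4-51). Because the
phase $\psi$ of the carrier of the echo pulses is unknown and can be assumed uniformly
distributed over $(0, 2\pi)$, the optimum detector will base its decisions on the statistic

$$r = |Z|, \qquad Z = X + iY = \sum_{m=1}^{M} g_m z_m,$$

with the weighting coefficients $g_m$ given by (4-59).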

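Let us once again assess the effectiveness of our receiver for detecting a target
located not in direction $\boldsymbol{\theta}$, but in direction $\boldsymbol{\theta}'$. Then from (4-55) and (4-46)

$$E(z_m \mid H_1, \boldsymbol{\theta}') = M^{-1/2} D e^{i\psi} \exp ik\boldsymbol{\xi}_m \cdot \boldsymbol{\theta}'.$$

Hence the expected value of Z is

$$\begin{aligned}
E(Z \mid H_1, \boldsymbol{\theta}') &= \frac{D}{\sqrt{M}}\, e^{i\psi} \sum_{m=1}^{M} g_m \exp ik\boldsymbol{\xi}_m \cdot \boldsymbol{\theta}' \\
&= \frac{D}{\sqrt{M}}\, e^{i\psi} \left\{ \sum_{m=1}^{M} \exp ik\boldsymbol{\xi}_m \cdot (\boldsymbol{\theta}' - \boldsymbol{\theta}) - \beta G(\mathbf{w} - \boldsymbol{\theta}) \sum_{m=1}^{M} \exp ik\boldsymbol{\xi}_m \cdot (\boldsymbol{\theta}' - \mathbf{w}) \right\} \\
&= D\sqrt{M}\, e^{i\psi} [G(\boldsymbol{\theta}' - \boldsymbol{\theta}) - \beta G(\mathbf{w} - \boldsymbol{\theta}) G(\boldsymbol{\theta}' - \mathbf{w})].
\end{aligned}$$

The variances of the real and imaginary parts of Z are

$$\begin{aligned}
\sigma^2 = \mathrm{Var}\, X = \mathrm{Var}\, Y &= \tfrac{1}{2} E(ZZ^* \mid H_0) = \tfrac{1}{2} \sum_{m=1}^{M} \sum_{n=1}^{M} g_m g_n^* E(z_m z_n^* \mid H_0) \\
&= \sum_{m=1}^{M} \sum_{n=1}^{M} g_m \phi_{mn} g_n^* = \sum_{m=1}^{M} \sum_{n=1}^{M} \sum_{j=1}^{M} g_m \phi_{mn} \mu_{nj} \exp ik\boldsymbol{\xi}_j \cdot \boldsymbol{\theta} \\
&= \sum_{m=1}^{M} \sum_{j=1}^{M} g_m \delta_{mj} \exp ik\boldsymbol{\xi}_j \cdot \boldsymbol{\theta} = \sum_{m=1}^{M} g_m \exp ik\boldsymbol{\xi}_m \cdot \boldsymbol{\theta} \\
&= M[1 - \beta |G(\mathbf{w} - \boldsymbol{\theta})|^2].
\end{aligned}$$

The false-alarm and detection probabilities are again given by (4-52), but the
effective signal-to-noise ratio is now

$$D_{\rm eff}^2 = D^2 |G'(\boldsymbol{\theta}'; \mathbf{w})|^2, \qquad |G'(\boldsymbol{\theta}'; \mathbf{w})|^2 = \frac{|G(\boldsymbol{\theta}' - \boldsymbol{\theta}) - \beta G(\mathbf{w} - \boldsymbol{\theta}) G(\boldsymbol{\theta}' - \mathbf{w})|^2}{1 - \beta |G(\mathbf{w} - \boldsymbol{\theta})|^2}.$$

Figure 4-11. Gain pattern $|G'(\boldsymbol{\theta}'; \mathbf{w})|^2$ (dB) in the plane of the noise source and
the perpendicular to the circular transducer array versus $y = 2\pi a|\boldsymbol{\theta}'|/\lambda$;
$2\pi a|\mathbf{w}|/\lambda = 1$. Solid curve: Mh = 0; dotted curve: Mh = 2; dashed curve: Mh = 10;
$Mh = MN_s/N$.

When the beam is "on target", $\boldsymbol{\theta} = \boldsymbol{\theta}'$, and the effective signal-to-noise ratio is

$$D_{\rm eff}^2 = D^2 [1 - \beta |G(\mathbf{w} - \boldsymbol{\theta})|^2].$$

If target and noise source are far apart, $D_{\rm eff}^2$ nearly equals $D^2$; if, on the other hand,
they lie in the same direction, $\boldsymbol{\theta} = \mathbf{w}$, the effective signal-to-noise ratio is reduced by
the factor $(1 + Mh)^{-1}$.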
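
These statements can be tried out numerically. The Python sketch below computes the ratio $D_{\rm eff}^2/D^2$ in two ways for a uniform line array (an illustrative geometry, not the circular array of Fig. 4-11): from a closed form in terms of the gain pattern G, namely $|G(\boldsymbol{\theta}' - \boldsymbol{\theta}) - \beta G(\mathbf{w} - \boldsymbol{\theta})G(\boldsymbol{\theta}' - \mathbf{w})|^2/[1 - \beta|G(\mathbf{w} - \boldsymbol{\theta})|^2]$ with $\beta = Mh/(1 + Mh)$, and directly from the weights $g_m = \exp(-ik\boldsymbol{\xi}_m \cdot \boldsymbol{\theta}) - \beta G(\mathbf{w} - \boldsymbol{\theta})\exp(-ik\boldsymbol{\xi}_m \cdot \mathbf{w})$ together with the covariance $\delta_{jm} + h\exp ik(\boldsymbol{\xi}_j - \boldsymbol{\xi}_m) \cdot \mathbf{w}$. The closed form is the reading of the effective signal-to-noise ratio adopted in this sketch, and the function names are hypothetical.

```python
import cmath
import math

def phase(xi, d, k):
    return k * (xi[0] * d[0] + xi[1] * d[1])

def gain(positions, d, k):
    """G(d) = (1/M) sum_m exp(i k xi_m . d)."""
    return sum(cmath.exp(1j * phase(xi, d, k)) for xi in positions) / len(positions)

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1])

def snr_factor(positions, theta, theta_p, w, k, h):
    """Closed form for D_eff^2 / D^2: look direction theta, target theta_p, noise source w."""
    m = len(positions)
    beta = m * h / (1.0 + m * h)
    g_wt = gain(positions, sub(w, theta), k)
    num = gain(positions, sub(theta_p, theta), k) - beta * g_wt * gain(positions, sub(theta_p, w), k)
    return abs(num) ** 2 / (1.0 - beta * abs(g_wt) ** 2)

def snr_factor_direct(positions, theta, theta_p, w, k, h):
    """Same ratio from |E(Z)|^2 / sigma^2 with D = 1, using the weights and covariance."""
    m = len(positions)
    beta = m * h / (1.0 + m * h)
    g_wt = gain(positions, sub(w, theta), k)
    g = [cmath.exp(-1j * phase(xi, theta, k)) - beta * g_wt * cmath.exp(-1j * phase(xi, w, k))
         for xi in positions]
    mean = sum(gm * cmath.exp(1j * phase(xi, theta_p, k))
               for gm, xi in zip(g, positions)) / math.sqrt(m)
    var = 0.0
    for a, xa in enumerate(positions):
        for b, xb in enumerate(positions):
            phi = (1.0 if a == b else 0.0) + h * cmath.exp(1j * (phase(xa, w, k) - phase(xb, w, k)))
            var += (g[a] * phi * g[b].conjugate()).real
    return abs(mean) ** 2 / var

K = 2.0 * math.pi                                   # unit wavelength
POSITIONS = [(0.5 * j, 0.0) for j in range(8)]      # 8 elements, half-wavelength spacing
H = 2.0                                             # N_s / N
```

With the noise source in the look direction the factor drops to $1/(1 + Mh)$, and with the noise source in a null of the unperturbed pattern it returns to 1, as stated above.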

Figure 4-11 illustrates how the strength $N_s$ of a point noise source affects the
shape of the optimum beam pattern. We take $\boldsymbol{\theta} = (0, 0)$, and we assume a circular
array of radius a, so that the beam pattern when $N_s = 0$ is the Airy pattern given
in (4-53). It is plotted in decibels as the solid curve marked 0; $y = 2\pi a|\boldsymbol{\theta}'|/\lambda$. The
point source of noise is located at an angle $\mathbf{w}$ corresponding to a value of y equal to
1. Thus with $a = 10\lambda$, the first nulls of the Airy pattern lie at ±7°, and the direction
of the noise source makes an angle of 1.8° with the perpendicular to the array. The
figure shows the function $|G'(\boldsymbol{\theta}'; \mathbf{w})|^2$ for directions $\boldsymbol{\theta}'$ lying in the plane determined
by the direction $\mathbf{w}$ of the noise source and that perpendicular. The function is
represented in decibels for values of Mh equal to 0, 2, and 10. As $Mh = MN_s/N$
increases, the first right-hand null in the beam pattern moves from y = 3.83171
toward the direction (y = 1) of the noise source, and the relative strengths of the
sidelobes increase. The optimum beam pattern reduces the gain in the direction of
the noise source as much as it can without unduly decreasing the gain in the direction
$\boldsymbol{\theta} = (0, 0)$ where a target is expected to lie.

4.4 COMPARISON OF RECEIVER PERFORMANCE

4.4.1 Asymptotic Relative Efficiency

A set of signals is either present in or absent from a sequence of M inputs $v_j(t)$ to
the receiver, $0 < t < T$, $1 \le j \le M$. In order to establish a simple way of assessing
how effectively the receiver detects these signals, we shall suppose that it processes
inputs $v_j(t)$ that are statistically homogeneous. We first describe conditions under
which statistical homogeneity occurs.

Denote the signal in the jth input by $s(t; a_j, \boldsymbol{\theta}'_j, \boldsymbol{\theta}'')$, where $a_j$ is a parameter
representing the strength of the signal, and $\boldsymbol{\theta}'_j$ and $\boldsymbol{\theta}''$ are sets of other parameters
characterizing it. The parameters designated collectively by $\boldsymbol{\theta}'_j$ are independently
random from one signal to another, and we assume that for all $j \in (1, M)$ their prior
probability density functions $z(\boldsymbol{\theta}'_j)$ are the same. The phase $\psi_j$ of a quasiharmonic
signal is a typical element of $\boldsymbol{\theta}'_j$. The parameters $\boldsymbol{\theta}''$ are the same in all the signals; we
call them the invariable parameters. The arrival time $\tau$ of a radar echo from a fixed
target, for instance, would be an element of $\boldsymbol{\theta}''$. If the target is moving at constant
velocity, the Doppler shift w of the carrier and the arrival time $\tau_1$ of, say, the first
signal make up $\boldsymbol{\theta}'' = (\tau_1, w)$; the arrival times of the other signals can be expressed
in terms of $\tau_1$ and w. In detecting a moving target the receiver can compensate for
the changing arrival times $\tau_j$ of the echoes from one input to another if the Doppler
shift w is known, for the Doppler shift is proportional to the velocity of the target
in the direction of the radar antenna. The signal strengths $a_j$ are assumed either to
be the same in all inputs or to be independently random from one input to another
and identically distributed in each.

The jth input is

$$v_j(t) = n_j(t), \qquad j = 1, 2, \ldots, M, \quad 0 < t < T,$$

under hypothesis $H_0$ and

$$v_j(t) = s(t; a_j, \boldsymbol{\theta}'_j, \boldsymbol{\theta}'') + n_j(t)$$

under hypothesis $H_1$. The noise $n_j(t)$ is a stochastic process assumed to have identical
statistical descriptions in all the inputs, and the noise in one input is statistically
independent of that in another. The inputs may be received simultaneously in disjoint
frequency bands, or they may be received one after another in the course of time.
The receiver processes each input in such a way as to produce a single datum

$$g_j = f[v_j(t)], \qquad j = 1, 2, \ldots, M,$$

which is a specified functional $f[\cdot]$ of that input. The functional $f[\cdot]$ may depend
on certain standard values of the invariable parameters $\boldsymbol{\theta}''$. A radar receiver for
detecting a target at a fixed location, for instance, will have its output sampled at
the time when an echo from the target is expected to appear. The datum $g_j$ might
then be

$$g_j = \left| \int_0^T F^*(t; \boldsymbol{\theta}'') V_j(t)\, dt \right|^2,$$

where $a_j F(t; \boldsymbol{\theta}'')$ is the complex envelope of the signal and $V_j(t)$ is the complex envelope
of the jth input $v_j(t) = \mathrm{Re}\, V_j(t) \exp i\Omega t$. As a consequence of our assumptions,
the random variables $g_j$ under both hypotheses $H_0$ and $H_1$ are statistically independent
and identically distributed for all $j \in (1, M)$. Their joint probability density
functions have the form

$$p_0(\mathbf{g}) = p_0(g_1, g_2, \ldots, g_M) = \prod_{j=1}^{M} p_0(g_j),$$

$$p_1(\mathbf{g}) = p_1(g_1, g_2, \ldots, g_M) = \prod_{j=1}^{M} p_1(g_j),$$

$\mathbf{g}$ standing for the set of all M statistics $g_j$. The density function $p_1(\mathbf{g})$ is assumed
independent of the values of the random parameters $\boldsymbol{\theta}'_j$ in the signals actually present
under hypothesis $H_1$.

The receiver bases its choice between hypotheses $H_0$ and $H_1$ on the $M$ data
$\mathbf{g} = (g_1, g_2, \ldots, g_M)$. In statistical terminology these are often called samples, but
they must not be confused with the samples of an input $v(t)$ that we utilized in
Chapter 2 to treat the elementary problem of detecting a known signal in Gaussian
noise. The datum $g_j$ would ideally represent the result of the optimum processing
of the $j$th input; that is,

$$g_j = \ln \Lambda[v_j(t); a_j, \theta''],$$

where

$$\Lambda[v_j(t); a_j, \theta''] = \int z(\theta'_j)\, \Lambda[v_j(t); a_j, \theta'_j, \theta'']\, d^{m'}\theta'_j \tag{4-60}$$

is the likelihood functional for the $j$th input averaged with respect to the prior
probability density function $z(\theta'_j)$ of any random parameters. This optimum processing
would require knowing the values of the invariable parameters $\theta''$ or the assumption
that they take certain standard values. Most often $g_j$ will simply be the result of
some convenient filtering and rectification of $v_j(t)$.

However the inputs $v_j(t)$ may have been processed to create the "samples" $\{g_j\}$,
the optimum way of utilizing them is to form their likelihood ratio

$$\Lambda(\mathbf{g}) = \frac{p_1(\mathbf{g})}{p_0(\mathbf{g})},$$

which is compared with a decision level $\Lambda_0$ determined by one's criterion of optimality.
Equivalently,

$$U = \ln \Lambda(\mathbf{g}) = \sum_{j=1}^{M} \ln\left[\frac{p_1(g_j)}{p_0(g_j)}\right]$$

is compared with the decision level $U_0 = \ln \Lambda_0$. Other schemes are conceivable,
however; the $g_j$'s may simply be added to form

$$G = \sum_{j=1}^{M} g_j, \tag{4-61}$$






or, as we shall see when we treat nonparametric detection in Chapter 8, they may 
be ranked in order of magnitude and a decision statistic derived from their places in 
that order. 

Determining the false-alarm and detection probabilities for receivers that base 
their decisions on statistics such as U and G formed from a large number M of 
samples may involve lengthy computations. In Chapter 5 we shall describe approx- 
imating techniques, but even these may not always be expeditious. One would like 
to have at least a crude way of assessing and comparing the performance of receivers
based on various functionals $f[\cdot]$ and various schemes for processing the
samples $\{g_j\}$. The simplest way seems to be to compare receivers on the basis of
their asymptotic relative efficiency, which we shall now define. 

A fixed pair $(Q_0, Q_d)$ of false-alarm and detection probabilities is adopted as
the standard of performance, with $Q_0 \ll Q_d \approx 1$. A typical pair would be
$(10^{-6}, 0.99)$. We call the pair $(Q_0, Q_d)$ the reliability of the receiver. The inputs
to a receiver are taken as statistically identical and homogeneous in the sense just
described. Two receivers processing the same kind of inputs $\{v_j(t)\}$ and attaining the
same reliability $(Q_0, Q_d)$ are said to be equipollent.

Let $M_1$ be the number of independent inputs required by receiver 1 and let
$M_2$ be the number required by receiver 2 in order for each to attain the reliability
$(Q_0, Q_d)$. Then the asymptotic relative efficiency (a.r.e.) of receiver 2 with respect
to receiver 1 is defined as

$$\text{a.r.e.}_{2:1} = \lim_{a \to 0} \frac{M_1}{M_2}, \tag{4-62}$$

where $a$ is a parameter specifying the strengths or the average strengths of the signals
under hypothesis $H_1$. In this limit $a \to 0$ the numbers $M_1$ and $M_2$ go to infinity; it
is the limiting value of their ratio that matters.

Let us suppose that each receiver bases its decisions on the sum

$$G_k = \sum_{j=1}^{M_k} g_j^{(k)}, \qquad g_j^{(k)} = f_k[v_j(t)], \qquad k = 1, 2, \tag{4-63}$$

of the values of a certain functional $f_k[\cdot]$ of its input. When the signal strength $a$ is
very small, $M_1$ and $M_2$ are very large, and by the central limit theorem the statistics
$G_1$ and $G_2$ have approximately Gaussian distributions. With $G_{k0}$ the decision level
on statistic $G_k$,

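$$G_k > G_{k0} \;\Rightarrow\; H_1, \qquad k = 1, 2,$$

the false-alarm and detection probabilities are approximately

$$Q_0 \approx \operatorname{erfc} x, \qquad x = \frac{G_{k0} - E(G_k \mid H_0)}{\sigma_0},$$

$$Q_d \approx \operatorname{erfc}\left[\frac{\sigma_0}{\sigma_1}\,(x - \Delta_k)\right],$$

$$\sigma_0^2 = \operatorname{Var}(G_k \mid H_0) = \operatorname{Var}_0 G_k, \qquad \sigma_1^2 = \operatorname{Var}(G_k \mid H_1),$$

where

$$\Delta_k^2 = \frac{[E(G_k \mid H_1) - E(G_k \mid H_0)]^2}{\operatorname{Var}_0 G_k} \tag{4-64}$$

is the effective signal-to-noise ratio at the output of receiver $k$.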

When each statistic is the sum of independent and identically distributed
random variables as in (4-63),

$$E(G_k \mid H_i) = M_k\, E(g^{(k)} \mid H_i),$$

$$\operatorname{Var}(G_k \mid H_i) = M_k \operatorname{Var}(g^{(k)} \mid H_i), \qquad k = 1, 2; \quad i = 0, 1,$$

where $g^{(k)} = f_k[v(t)]$, we can write for the reliability

$$Q_0 \approx \operatorname{erfc} x, \qquad Q_d \approx \operatorname{erfc}\left[\frac{\sigma_0}{\sigma_1}\left(x - M_k^{1/2} D_k\right)\right]. \tag{4-65}$$

Here

$$D_k^2 = \frac{[E(g^{(k)} \mid H_1) - E(g^{(k)} \mid H_0)]^2}{\operatorname{Var}_0 g^{(k)}} \tag{4-66}$$

is the effective signal-to-noise ratio of the output of processor $k$, $k = 1, 2$; $D_k^2$ is
sometimes called the deflection and often is quoted in decibels (dB). Neither adding
an arbitrary constant to the statistic $g^{(k)}$ nor multiplying it by an arbitrary constant
alters the deflection $D_k^2$.

If $g_j^{(k)}$ is a linear functional of the input $v_j(t)$, $\sigma_1^2 = \sigma_0^2$. Usually, however, it
is nonlinear, and the variances $\sigma_0^2$ and $\sigma_1^2$ differ because of the interaction between
the signal and the noise under hypothesis $H_1$; but in the limit $a \to 0$ they become
equal. Equating the reliabilities $(Q_0, Q_d)$ of the two receivers is then equivalent to
putting

$$M_1^{1/2} D_1 = M_2^{1/2} D_2,$$

whereupon the asymptotic relative efficiency of the two equipollent receivers is the
ratio of their deflections,

$$\text{a.r.e.}_{2:1} = \lim_{a \to 0} \frac{D_2^2}{D_1^2} \tag{4-67}$$

in the limit of vanishing signal strength a. The concept of asymptotic relative effi- 
ciency, attributed to Pitman [Pit49], has been applied to signal detection by Middle- 
ton [Mid60a, Ch. 20], Capon [Cap61], and others. 
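
As a rough numerical illustration of (4-65) to (4-67), the following sketch computes, under the Gaussian approximation with $\sigma_0 = \sigma_1$, the number of inputs a receiver of given deflection needs in order to attain a reliability $(Q_0, Q_d)$, and confirms that the ratio of the two numbers equals the ratio of the deflections. The function name `inputs_required` and the two deflection values are hypothetical and serve only as an example; erfc is assumed here to denote the upper-tail probability of a unit Gaussian random variable.

```python
from statistics import NormalDist

def inputs_required(q0, qd, deflection_sq):
    """Number M of inputs for which M**0.5 * D = x0 - xd, where the
    unit-Gaussian upper-tail probabilities at x0 and xd are q0 and qd.
    Assumes equal variances under both hypotheses (sigma_0 = sigma_1)."""
    std = NormalDist()
    x0 = std.inv_cdf(1.0 - q0)   # decision level giving false-alarm probability q0
    xd = std.inv_cdf(1.0 - qd)   # negative when qd > 1/2
    return (x0 - xd) ** 2 / deflection_sq

# Hypothetical deflections D_1^2 and D_2^2 (illustrative values only).
d1_sq, d2_sq = 0.010, 0.012
m1 = inputs_required(1e-6, 0.99, d1_sq)
m2 = inputs_required(1e-6, 0.99, d2_sq)
are_21 = m1 / m2                 # relative efficiency of receiver 2 to receiver 1
print(round(m1), round(m2), are_21)
```

Because the factor $(x_0 - x_d)^2$ is common to both receivers, it cancels in the ratio, which is why the relative efficiency in this approximation does not depend on the reliability chosen.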

When the signal strength $a$ is very small, the deflection for a statistic $g$ can be
written

$$D^2 = \frac{[E(g \mid H_1, a) - E(g \mid H_0)]^2}{\operatorname{Var}_0 g} \approx a^2 \eta, \tag{4-68}$$

where

$$\eta = \frac{1}{\operatorname{Var}_0 g} \left[ \frac{\partial E(g \mid H_1, a)}{\partial a} \bigg|_{a=0} \right]^2 \tag{4-69}$$

is called the efficacy of the detector $g = f[v(t)]$. The signal strength $a$ is so defined
that the first derivative of $E(g \mid H_1, a)$ is the one of lowest order that does not vanish
at $a = 0$. The asymptotic relative efficiency of two equipollent detectors is then the
ratio of their efficacies,

$$\text{a.r.e.}_{2:1} = \frac{\eta_2}{\eta_1}.$$

The asymptotic relative efficiency is particularly useful for comparing receivers when 
the noise is non-Gaussian [Kas88] or the mode of processing the inputs is too compli- 
cated for the probability of detection to be calculated exactly. Examples will appear 
in Chapter 8. 



4.4.2 Threshold Detection 

The receiver that shows up best in comparisons based on asymptotic relative efficiency
is the threshold detector, which as in Sec. 3.6.3 is defined by

$$g_a = g_a[v(t); \theta''] = \frac{\partial}{\partial a} \Lambda[v(t); a, \theta''] \bigg|_{a=0}, \tag{4-70}$$

where $\Lambda[v(t); a, \theta'']$ is the likelihood functional averaged over the random parameters
$\theta'$ as in (4-60). Again $a$ is a parameter measuring the strength of the input signal and
defined in such a way that the derivative $\partial^k \Lambda / \partial a^k$ of lowest order $k$ that does not
vanish at $a = 0$ is the first [Rud62]. In the considerations of Sec. 4.2.1, for instance,
a = {d k ). The threshold detector maximizes the average detection probability in the 
limit $a \to 0$; in that limit

$$\Lambda[v(t); a, \theta''] \approx 1 + a\, g_a[v(t); \theta'']. \tag{4-71}$$

In order to show that the receiver based on the likelihood ratio is optimum in
the sense that it has maximum effective signal-to-noise ratio, we consider an arbitrary
functional $G = f[\{v_j(t)\}]$ of the set of $M$ inputs $v_j(t)$, $j = 1, 2, \ldots, M$. Its effective
signal-to-noise ratio is, as in (4-64),

$$\Delta^2 = \frac{[E(G \mid H_1) - E(G \mid H_0)]^2}{\operatorname{Var}_0 G}. \tag{4-72}$$

Let us sample the set of inputs $\{v_j(t)\}$ by some appropriate means, as was done for
instance in Chapter 2, obtaining a set $\mathbf{v}$ of samples that we eventually take to be
infinitely numerous. Then $G = G(\mathbf{v})$ is a function of the set of samples, and when
we limit ourselves to only $n$ of them, its expected value under hypothesis $H_1$ is

$$E(G \mid H_1, n) = \int G(\mathbf{v})\, p_1(\mathbf{v})\, d^n\mathbf{v},$$

where $p_1(\mathbf{v})$ is the joint probability density function of $\mathbf{v} = (v_1, \ldots, v_n)$. We can then
write

$$E(G \mid H_1, n) = \int G(\mathbf{v})\, \Lambda(\mathbf{v})\, p_0(\mathbf{v})\, d^n\mathbf{v} = E(G \Lambda(\mathbf{v}) \mid H_0, n),$$



where 



$$\Lambda(\mathbf{v}) = \frac{p_1(\mathbf{v})}{p_0(\mathbf{v})}$$

is the likelihood ratio. When we pass to the limit $n \to \infty$ of an infinite number of
samples, this likelihood ratio becomes the likelihood functional

$$\Lambda[\{v_j(t)\}; a, \theta''],$$

which we abbreviate as $\Lambda(v)$, and we find

$$E(G \mid H_1) = E[G \Lambda(v) \mid H_0]. \tag{4-73}$$

Putting $G = 1$, we find, furthermore, that $E[\Lambda(v) \mid H_0] = 1$.
We can then write for the numerator in (4-72)

$$[E(G \mid H_1) - E(G \mid H_0)]^2 = [E\{G[\Lambda(v) - 1] \mid H_0\}]^2 = [E\{(G - G_0)[\Lambda(v) - 1] \mid H_0\}]^2,$$

where $G_0 = E(G \mid H_0)$ is nonrandom. The Schwarz inequality for expectations states
that for two random variables $A$ and $B$,

$$[E(AB)]^2 \le E(A^2)\, E(B^2), \tag{4-74}$$

with equality when $A = cB$, $c$ any nonrandom constant [Hel91, p. 186], [Pap91,
p. 154]. Applying this with $A = G - G_0$ and $B = \Lambda(v) - 1$, we find

$$[E(G \mid H_1) - E(G \mid H_0)]^2 \le E[(G - G_0)^2 \mid H_0]\; E\{[\Lambda(v) - 1]^2 \mid H_0\} = \operatorname{Var}_0 G \,\operatorname{Var}_0 [\Lambda(v) - 1],$$

and by (4-72) we obtain an inequality on the effective signal-to-noise ratio:

$$\Delta^2 \le \operatorname{Var}_0 [\Lambda(v) - 1].$$

Equality obtains when

$$G = c[\Lambda(v) - 1] = c\{\Lambda[\{v_j(t)\}; a, \theta''] - 1\}.$$

Thus for a fixed number $M$ of inputs $v_j(t)$, the likelihood-ratio receiver will attain the
largest effective signal-to-noise ratio among all ways of processing them. In the limit
$a \to 0$, $\Lambda(v) - 1$ becomes proportional to the sum of threshold statistics $g_a[v_j(t); \theta'']$
as defined in (4-70).
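
The bound just derived can be checked numerically in the simplest possible setting. The sketch below is only an illustration under assumptions of our own choosing: a single real datum $v$ that is Gaussian with unit variance and mean 0 under $H_0$ and mean $A$ under $H_1$, so that the likelihood ratio is $\exp(Av - A^2/2)$. Expectations under $H_0$ are evaluated by the midpoint rule, and $E(G \mid H_1)$ is obtained from $E(G\Lambda \mid H_0)$ as in (4-73). The names `expect0`, `lam`, and `snr` are hypothetical helpers.

```python
from math import exp, pi, sqrt

def expect0(func, lo=-12.0, hi=12.0, n=24000):
    """E[func(v) | H0] for v ~ N(0, 1), by the midpoint rule."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        v = lo + (i + 0.5) * h
        total += func(v) * exp(-0.5 * v * v)
    return total * h / sqrt(2.0 * pi)

A = 0.5  # signal strength (illustrative value)

def lam(v):
    """Likelihood ratio p1(v)/p0(v) for N(A, 1) against N(0, 1)."""
    return exp(A * v - 0.5 * A * A)

def snr(stat):
    """Effective signal-to-noise ratio (4-72) of the statistic stat(v)."""
    m0 = expect0(stat)
    m1 = expect0(lambda v: stat(v) * lam(v))      # E(G | H1) via (4-73)
    var0 = expect0(lambda v: (stat(v) - m0) ** 2)
    return (m1 - m0) ** 2 / var0

bound = expect0(lambda v: (lam(v) - 1.0) ** 2)    # Var_0[Lambda - 1]
snr_lr = snr(lambda v: lam(v) - 1.0)              # likelihood-ratio statistic
snr_linear = snr(lambda v: v)                     # linear statistic G = v
snr_square = snr(lambda v: v * v)                 # quadratic statistic G = v**2
print(bound, snr_lr, snr_linear, snr_square)
```

In this example the likelihood-ratio statistic attains the bound, while the linear and quadratic statistics fall below it; for small $A$ the linear statistic comes close, as one expects from the threshold expansion.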

Putting $G = \Lambda(v) - 1$ into (4-73), we find

$$E[\Lambda(v) - 1 \mid H_1] = E\{\Lambda(v)[\Lambda(v) - 1] \mid H_0\} = E\{[\Lambda(v) - 1]^2 \mid H_0\} = \operatorname{Var}_0 [\Lambda(v) - 1],$$

the second equality holding because $E[\Lambda(v) - 1 \mid H_0] = 0$.
The effective signal-to-noise ratio for the likelihood-ratio detector is therefore

$$\Delta^2 = E\{[\Lambda(v) - 1] \mid H_1\},$$

and by (4-71) that of the threshold detector is

$$\Delta^2 = M a\, E\{g_a[v(t); \theta''] \mid H_1\} \tag{4-75}$$

with $g_a[v(t); \theta'']$ defined in (4-70). This often provides the simplest way of calculating
the effective signal-to-noise ratio of the threshold detector.

4.4.3 Comparison of the Linear and the Quadratic Detector



As an example, consider the detection of a known quasiharmonic signal in white
Gaussian noise. As we learned in Sec. 4.2.1, the threshold detector utilizes the
functional

$$g^{(2)} = \left| \int_0^T F^*(t)\, V(t)\, dt \right|^2,$$

which is the output, sampled at time $t = T$, of a quadratic rectifier following a filter
matched to the signal $\mathrm{Re}\, F(t) \exp i\Omega t$. Compare that receiver with one utilizing an
$m$th-law rectifier instead,

$$g^{(m)} = \left| \int_0^T F^*(t)\, V(t)\, dt \right|^m.$$

In Problem 4-7 you are asked to calculate the asymptotic relative efficiency of these
two receivers. The asymptotic relative efficiency $\text{a.r.e.}_{2:m}$ is always greater than 1 for
$m \ne 2$. For a linear rectifier ($m = 1$)

$$\text{a.r.e.}_{2:1} = \frac{\eta_2}{\eta_1} = \frac{4(4 - \pi)}{\pi} = 1.09296.$$

For these receivers the signal-strength parameter $a$ is proportional to the
squared amplitude $A^2$ of the input signal, that is, to the input energy-to-noise ratio
$S = E/N$; $E$ is the input signal energy and $N$ the unilateral spectral density
of the input noise. Indicating the receivers by the subscript $k = 1, 2$, and taking
$M_1 = M_2 = M \gg 1$, we find from (4-66) and (4-68)

$$D_1^2 = S_1^2 \eta_1 = D_2^2 = S_2^2 \eta_2,$$

whence the ratio of input energy-to-noise ratios needed for the two receivers to attain
the same reliability must be

$$\frac{S_1}{S_2} = \left( \frac{\eta_2}{\eta_1} \right)^{1/2} = 1.04545 \equiv 0.193\ \text{dB}.$$

That is, the receiver with a linear rectifier requires 0.193 dB more input signal energy
than one with a quadratic rectifier when $M \gg 1$.
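
These figures can be reproduced with a few lines of code. The sketch below is not the solution of Problem 4-7 as the text intends it, only one way of arriving at the numbers, under the following assumptions of ours: the rectified output is $r^m$, where $r$ is normalized to have the Rayleigh density $r e^{-r^2/2}$ under $H_0$ and a Rician density with parameter $d$ under $H_1$, and the signal strength is $a = d^2$. Then $E(r^m \mid H_0) = 2^{m/2}\,\Gamma(1 + m/2)$, the derivative of $E(r^m \mid H_1)$ with respect to $d^2$ at $d = 0$ is $m/4$ times that value, and (4-69) gives the efficacy coded in the hypothetical function `efficacy`.

```python
from math import gamma, log10, pi, sqrt

def efficacy(m):
    """Efficacy (4-69) of the mth-law detector r**m with a = d**2,
    assuming Rayleigh noise statistics of unit scale under H0."""
    g = gamma(1.0 + m / 2.0)
    # [d E(r^m | H1)/da at a = 0]^2 / Var_0(r^m); the common factor 2**m cancels
    return (m * m / 16.0) * g * g / (gamma(1.0 + m) - g * g)

def are_quadratic_vs(m):
    """Asymptotic relative efficiency of the quadratic detector with
    respect to the mth-law detector."""
    return efficacy(2.0) / efficacy(m)

are_21 = are_quadratic_vs(1.0)
closed_form = 4.0 * (4.0 - pi) / pi
energy_ratio = sqrt(are_21)               # S_1 / S_2 for equal reliability
loss_db = 10.0 * log10(energy_ratio)
print(are_21, closed_form, energy_ratio, loss_db)
```

Evaluating `are_quadratic_vs` for other exponents, such as $m = 3$ or $m = 4$, gives values above 1 as well, in keeping with the statement that the quadratic rectifier is asymptotically the most efficient of the family.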

The relative standings of these receivers are reversed when $M$ is less than about
100, and the receiver with a quadratic rectifier is superior to one with a linear rectifier
only when the number $M$ of inputs is so large that the first term of (4-14) is an
adequate approximation in (4-13) for signals and noise of such strengths that the
receiver attains an acceptable reliability $(Q_0, Q_d)$. As we shall now demonstrate,
however, for small numbers $M$ and for reasonable values of the probabilities $Q_0$
and $Q_d$, the average value of $d_m r_m$ in (4-13) is so large that it is, for most of the
observations, well beyond the quadratic part of the curve in Fig. 4-1, even when no
signal is present.

From (4-29) with $U = \tfrac{1}{2} r^2$, $M = 1$, it follows that with no signal present the
random variable $r$ has the Rayleigh distribution

$$p_0(r) = r\, e^{-r^2/2}\, U(r),$$



and its expected value under hypothesis $H_0$ is

$$E(r \mid H_0) = \bar{r} = \int_0^\infty r\, p_0(r)\, dr = \left( \frac{\pi}{2} \right)^{1/2}.$$

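Hence, if all pulses have equal amplitudes, so that $d_m = (2S/M)^{1/2}$, the average value
of the argument $d_m r_m$ of (4-13) is

$$\overline{d_m r_m} = \left( \frac{2S}{M} \right)^{1/2} \left( \frac{\pi}{2} \right)^{1/2} = \left( \frac{\pi S}{M} \right)^{1/2}$$
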
when no signal is present; S is the total energy-to-noise ratio. If we are concerned 
with signals yielding, say, a detection probability Qj = 0.90 for a false : alarm prob- 
ability go - 10"*, we can utilize a routine for calculating the inverse Marcum Q 
function to determine the following average values of the arguments $d_m r_m$ under
hypothesis $H_0$:

  M                   1      20     50     100    200
  average d_m r_m     5.72   1.81   1.34   1.08   0.89

Referring to Fig. 4-1, we see that all but the last two of these are located on a part
of the curve of $\ln I_0(x)$ that is nearly linear. When a signal is present, the average
value $\overline{d_m r_m}$ is even larger. Because the values of the random variables $d_m r_m$ tend
to cluster about their averages $\overline{d_m r_m}$, most of them will lie on the linear part of the
curve of Fig. 4-1 for values of $M$ up to about 50 or 100.

In Fig. 4-12 we have plotted the ratio $S_1/S_2$ of input energy-to-noise ratios
required for equipollence of the linear and quadratic detectors with finite values
of $M$, under the assumption that the false-alarm number, as defined in (4-42),
equals $10^6$. Curves are presented for four values of the probability $Q_d$ of detection.
These curves were computed as shown in [Hel90] by methods to be described in
Secs. 5.2 and 5.3.

This ratio approaches 0.193 dB as $M$ increases, but only slowly; and the closer
$Q_d$ lies to 1, the slower does the ratio $S_1/S_2$ converge toward that asymptotic value.
Comparing the performances of two receivers through their asymptotic relative efficiency
depends, as in (4-65), on the assumption that their decision statistics have
Gaussian distributions, but the farther the decision level lies in the tails of the distributions
under hypotheses $H_0$ and $H_1$, the poorer that Gaussian approximation is.
We see, furthermore, that for $M$ less than about 100 the linear detector is the better
of the two, in accordance with our remarks at the beginning of this part.

4.4.4 Invariable Parameters Unknown

Figure 4-12. Relative efficiency R.E. = $10 \log_{10}(S_1/S_2)$ of linear and quadratic
detectors as a function of the number $M$ of inputs summed; false-alarm number $10^6$.
Curves are indexed with the probability $Q_d$ of detection. [Reprinted from C. W. Helstrom,
"Performance of receivers with linear rectifiers," IEEE Transactions on Aerospace
& Electronic Systems, vol. AES-26 (Mar. 1990), 210-7, ©1990 IEEE.]

The threshold detector in (4-70) entails adopting some standard values of the invariable
parameters $\theta''$; these are by definition the same in all $M$ inputs $v_j(t)$. If the
values of $\theta''$ are unknown a priori and the receiver is to be designed for a broad
range of those values, some prior probability density function $z(\theta'')$ must be adopted
for them, and the overall likelihood functional must be averaged with respect to it
to form

$$\Lambda[\{v_j(t)\}; a] = \int_{\Theta''} z(\theta'')\, \Lambda[\{v_j(t)\}; a, \theta'']\, d^{m''}\theta'',$$



where $m''$ is the number of invariable parameters $\theta''$ and $\Theta''$ is the space in which
they take their values. In the weak-signal limit $a \to 0$, by (4-71),

$$\prod_{k=1}^{M} \Lambda[v_k(t); a, \theta''] \approx \prod_{k=1}^{M} \{1 + a\, g_a[v_k(t); \theta'']\} \approx 1 + a \sum_{k=1}^{M} g_a[v_k(t); \theta'']$$

to first order in $a$, and

$$\Lambda[\{v_j(t)\}; a] \approx 1 + a \sum_{k=1}^{M} \int_{\Theta''} z(\theta'')\, g_a[v_k(t); \theta'']\, d^{m''}\theta''. \tag{4-76}$$

In the limit $a \to 0$, therefore, the invariable parameters are treated in the same way
as the parameters $\theta'$ that are independently random from one input to another, and
we can combine them into $\theta = (\theta', \theta'')$ to write the threshold statistic derived from
(4-76) as

$$g_a[\{v_j(t)\}] = \frac{\partial}{\partial a} \Lambda[\{v_j(t)\}; a] \bigg|_{a=0} = \sum_{k=1}^{M} g_a[v_k(t)], \tag{4-77}$$

where for all $k \in (1, M)$

$$g_a[v_k(t)] = \frac{\partial}{\partial a} \Lambda[v_k(t); a] \bigg|_{a=0} = \lim_{a \to 0} \left[ a^{-1} \{\Lambda[v_k(t); a] - 1\} \right] \tag{4-78}$$

with

$$\Lambda[v(t); a] = \int_{\Theta} z(\theta)\, \Lambda[v(t); a, \theta]\, d^m\theta. \tag{4-79}$$

Here $\Lambda[v(t); a, \theta]$ is the likelihood functional for detecting the signal $s(t; a, \theta)$ in any
one of the $M$ inputs $v_j(t)$, and $z(\theta)$ is the joint prior probability density function of
the $m$ parameters $\theta = (\theta', \theta'')$ other than the signal strength $a$.

In Sec. 7.6 we shall show how to determine a prior probability density function
for the parameters $\theta$ that is least favorable in the sense that the threshold receiver
based on it has minimum efficacy among all possible prior density functions $z(\theta)$.



4.5 DISTRIBUTED DETECTION 

A distributed-detection system consists of a number M of detectors or sensors in- 
tended to pick up a common electromagnetic or acoustic disturbance of some pre- 
scribed form, the "signal." Placed in separate locations, they transmit information 
about their ambient fields to a central processor, or fusion center, which makes the 
final decision about the presence or absence of the disturbance. The sensors do not 
communicate among themselves. In order to reduce to a bare minimum the communication
links to the fusion center, each sensor transmits only 0's or 1's: if it in
some way decides that the signal is present, it sends a 1, otherwise a 0.

The $k$th sensor processes its input $v_k(t)$ during an observation interval $(0, T)$ to
produce a datum $g_k$ indicative of its estimate of the strength of the signal component
of its input. If $g_k$ exceeds a certain decision level $g_{k0}$, the sensor transmits a 1 to
the fusion center, otherwise a 0. Every $T$ seconds the center receives a set of $M$ 0's
and 1's, and on the basis of these it decides between the null hypothesis $H_0$ that the
disturbance was absent, the inputs $v_k(t)$ consisting only of random noise, and the
alternative hypothesis $H_1$ that the disturbance was present in the fields incident on
the $M$ sensors.

How can each sensor most effectively process its input, and how can the fusion
center optimally make its decisions? The latter has a great variety of possibilities.
Depending on what it knows about the probability distributions of the inputs $v_k(t)$
under hypotheses $H_0$ and $H_1$, it may assign different weights to the 0's and 1's
received from different sensors, and the environment of some of the sensors may be
so noisy that it is best to disregard their responses altogether.

The datum $g_k$ produced in the $k$th sensor can be represented as a functional
$g_k = f_k[v_k(t)]$ of its input. How that functional is best designed and how the decision
level $g_{k0}$ on $g_k$ is set will depend in general on the joint probability distributions of
the inputs to all the sensors and on the rule by which the fusion center makes its
decisions. In principle the optimum system will minimize the average cost of its
operation, or it will maximize the probability $Q_d$ of deciding that the disturbance
is present (hypothesis $H_1$) for a preassigned value $Q_0$ of the probability of declaring
for $H_1$ when $H_0$ is true; a Bayes or a Neyman-Pearson criterion may be adopted.
Needless to say, determining the optimum system is, except under special conditions,
quite complex.

4.5.1 Identical Independent Sensors

For simplicity let us consider a system in which the $M$ sensors are identical and
in which their inputs $\{v_k(t)\}$ are independent and identically distributed. The functionals
$f_k[\cdot]$ will then likewise be identical. Under hypothesis $H_0$ the $k$th input
is

$$v_k(t) = n_k(t);$$

the $n_k(t)$ are independent Gaussian random processes with identical autocovariance
functions. Under $H_1$ the $k$th input is

$$v_k(t) = n_k(t) + s(t; \theta_k),$$

in which the signal components $s(t; \theta_k)$ have the same known form, but may depend
on certain unknown parameters $\theta_k$, assumed statistically independent from sensor
to sensor. Typically $\theta_k$ may be a random phase $\psi_k$,

$$s(t; \psi_k) = A\, \mathrm{Re}\, F(t) \exp(i\Omega t + i\psi_k),$$

with each $\psi_k$ uniformly distributed over $(0, 2\pi)$. Alternatively, the signals may fade
independently, $\theta_k = (A_k, \psi_k)$,

$$s(t; A_k, \psi_k) = A_k\, \mathrm{Re}\, F(t) \exp(i\Omega t + i\psi_k),$$

and the random amplitudes $A_k$ may, for example, have a common Rayleigh distribution
as in (4-17). Each sensor will then optimally pass its input through a filter
matched to the signal $\mathrm{Re}\, F(t) \exp i\Omega t$; the output, rectified and sampled at the end
of each observation interval, produces a datum

$$g_k = \left| \int_0^T F^*(t)\, V_k(t)\, dt \right|^2,$$

where $V_k(t)$ is the complex envelope of the $k$th input.

If, on the other hand, the signals, when present, are expected to be periodically
repeated, but with independently random phases and possibly independently random
amplitudes, the datum $g_k$ might as in (4-19) be the sum of quadratically rectified and
sampled outputs of a matched filter. In any case, we assume that the probability
density functions $p_0(g_k)$ and $p_1(g_k)$ of each datum under the two hypotheses are
known and identical from sensor to sensor. The $k$th sensor will then transmit a 1
to the fusion center if $g_k > g_0$ and a 0 if $g_k < g_0$; the decision level $g_0$ is the same
in all sensors.

The probability of each 1 is

$$q_i(g_0) = \int_{g_0}^{\infty} p_i(g)\, dg, \qquad i = 0, 1,$$

under hypothesis $H_i$, and the probability of a succession of $m$ 1's and $M - m$ 0's
received at the fusion center in a given order is

$$\Pr(M - m \text{ 0's},\ m \text{ 1's} \mid H_i) = (1 - q_i)^{M-m} q_i^m, \qquad 0 \le m \le M.$$

The likelihood ratio for this observation is a function of $m$,

$$\Lambda(m) = \frac{(1 - q_1)^{M-m} q_1^m}{(1 - q_0)^{M-m} q_0^m},$$

and we see that the number $m$ of 1's is a sufficient statistic for deciding between the
two hypotheses. Ideally the fusion center applies a randomized strategy as described
in Sec. 1.2.5, selecting hypothesis $H_1$ whenever the number $m$ exceeds a decision level
$m_0$; if $m = m_0$, it selects $H_1$ with probability $f$. It chooses $H_0$ whenever $m < m_0$.
The probability under hypothesis $H_i$ of observing $m$ 1's and $M - m$ 0's is

$$P_i(m, g_0) = \binom{M}{m} q_i^m (1 - q_i)^{M-m}, \qquad q_i = q_i(g_0), \qquad i = 0, 1. \tag{4-81}$$

The conversion of each datum $g_k$ into 0 or 1 is called quantization, and $g_0$ is
the quantization level. Under the Neyman-Pearson criterion $g_0$ should be chosen so
that for a preassigned false-alarm probability $Q_0$ the probability $Q_d$ of detection is as
large as possible. Instead of being generated in $M$ distributed sensors, the data $g_k$
might simply be the outputs of a processing $f[v_k(t)]$ of inputs to a single receiver
during a succession of $M$ intervals of duration $T$, as in Sec. 4.1.1. The receiver
then counts the number of times its $M$ outputs $g_k$ exceed the level $g_0$ and makes its
decision about the presence or absence of a signal in the manner just described. In
this context, the procedure is called binary integration [Sch56], [Sch75, pp. 2...].

Through (1-57), (1-58), and (4-81) the false-alarm and detection probabilities
are functions of the quantization level $g_0$:

$$Q_0(g_0) = f P_0(m_0, g_0) + \sum_{m=m_0+1}^{M} P_0(m, g_0), \tag{4-82}$$

$$Q_d(g_0) = f P_1(m_0, g_0) + \sum_{m=m_0+1}^{M} P_1(m, g_0). \tag{4-83}$$

Given a value of the quantization level $g_0$, we determine the integer $m_0$ and the
probability $f$ by the method outlined in connection with (1-57), so that $Q_0(g_0)$ equals
the preassigned false-alarm probability. The probability $Q_d(g_0)$ of detection is then
given by (4-83).
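
The determination of $m_0$ and $f$ can be sketched in a few lines. The code below is a minimal illustration, not the procedure of Sec. 1.2.5 itself: it takes the per-sensor probabilities $q_0$ and $q_1$ of sending a 1 as given numbers, picks the smallest decision level $m_0$ for which the probability of more than $m_0$ 1's under $H_0$ does not exceed $Q_0$, and then chooses $f$ so that the false-alarm probability equals $Q_0$ exactly. The function names and the numerical values are hypothetical.

```python
from math import comb

def pmf(m, M, q):
    """Probability (4-81) of exactly m 1's among M independent sensors."""
    return comb(M, m) * q ** m * (1.0 - q) ** (M - m)

def tail(m0, M, q):
    """Probability of more than m0 1's."""
    return sum(pmf(m, M, q) for m in range(m0 + 1, M + 1))

def fusion_rule(M, q0, Q0):
    """Return (m0, f): the smallest m0 whose tail under H0 does not exceed
    Q0, and the randomization probability f that makes up the difference."""
    for m0 in range(M + 1):
        t = tail(m0, M, q0)
        if t <= Q0:
            return m0, (Q0 - t) / pmf(m0, M, q0)

def detection_probability(M, q1, m0, f):
    """Randomized-rule detection probability: f*P1(m0) + Pr(m > m0 | H1)."""
    return f * pmf(m0, M, q1) + tail(m0, M, q1)

# Illustrative (hypothetical) per-sensor probabilities of sending a 1.
M, q0, q1, Q0 = 15, 0.10, 0.60, 0.001
m0, f = fusion_rule(M, q0, Q0)
Qd = detection_probability(M, q1, m0, f)
print(m0, f, Qd)
```

The loop assumes $0 < q_0 < 1$, so that the probability of exactly $m_0$ 1's is never zero when it is used as a divisor.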

Figure 4-13. False-dismissal probability $Q_1 = 1 - Q_d$ for Rayleigh-fading signals
in a distributed-detection system with $M = 15$ sensors as a function of the quantization
level $g_0$; $S = 6.2827$, $Q_0 = 0.001$.

For Rayleigh-fading signals, for example, we can take, as in Sec. 4.2.3,

$$q_0 = \exp(-g_0), \qquad q_1 = \exp\left( -\frac{g_0}{1 + S} \right), \qquad g_0 > 0, \tag{4-84}$$

where $S$ is the average energy-to-noise ratio at the input of each sensor. In Fig. 4-13
we show how the false-dismissal probability $Q_1(g_0) = 1 - Q_d(g_0)$ depends on the
quantization level $g_0$ for $M = 15$, $Q_0 = 0.001$, and $S = 6.2827$. Each cusp represents
a pure strategy ($f = 0$). As $g_0$ increases, the probability $f$ increases from 0 to 1 on
each convex branch of the curve, and the decision level $m_0$ on the number $m$ of 1's
decreases from one cusp to the next.

To determine the optimum value of the quantization level $g_0$, it suffices to set
$f = 0$ in (4-82) and (4-83). We then equate the right side of (4-82) to the preassigned
false-alarm probability, and for a succession of values of $m_0$ starting at $m_0 = 0$,
we solve it for $g_0$ by Newton's method, substitute $g_0$ into (4-83) to determine the
probability of detection, and stop when this reaches its maximum value [Wor68].
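
A minimal sketch of this search follows, for the Rayleigh-fading case with $M = 15$, $Q_0 = 0.001$, and $S = 6.2827$. It departs from the description above in two respects, both for the sake of brevity: the equation for the quantization level is solved by bisection on $q_0$ rather than by Newton's method, and all decision levels $m_0$ are scanned instead of stopping at the first maximum. It assumes the per-sensor probabilities $q_0 = \exp(-g_0)$ and $q_1 = \exp[-g_0/(1+S)]$; the function names are hypothetical.

```python
from math import comb, exp, log

def tail(m0, M, q):
    """Probability of more than m0 1's among M sensors, each sending a 1
    with probability q (pure strategy, f = 0)."""
    return sum(comb(M, m) * q ** m * (1.0 - q) ** (M - m)
               for m in range(m0 + 1, M + 1))

def solve_q0(m0, M, Q0, iters=200):
    """Bisection for the q0 at which tail(m0, M, q0) = Q0; the tail
    probability increases monotonically with q0."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if tail(m0, M, mid) < Q0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pure_strategy(m0, M, Q0, S):
    """Quantization level g0 and detection probability for decision level m0."""
    q0 = solve_q0(m0, M, Q0)
    g0 = -log(q0)                  # from q0 = exp(-g0)
    q1 = exp(-g0 / (1.0 + S))      # Rayleigh-fading signal, assumed form
    return g0, tail(m0, M, q1)

def best_pure_strategy(M, Q0, S):
    """Scan all decision levels and keep the one with the largest Q_d."""
    results = [(m0,) + pure_strategy(m0, M, Q0, S) for m0 in range(M)]
    return max(results, key=lambda item: item[2])

m0_best, g0_best, qd_best = best_pure_strategy(M=15, Q0=0.001, S=6.2827)
print(m0_best, g0_best, 1.0 - qd_best)
```

The last quantity printed is the false-dismissal probability at the best cusp, which is the kind of minimum that Fig. 4-13 is described as exhibiting.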

For detection of a narrowband signal of random phase in Gaussian noise, with
suitable normalization of the statistic $g_k$,

$$q_0 = \exp(-g_0), \qquad q_1 = Q\left( d, \sqrt{2 g_0} \right), \tag{4-85}$$

as in (3-70) and (3-75), where $Q(\cdot, \cdot)$ is Marcum's Q function (3-76). In order
to show how much is lost by quantizing the data $\{g_k\}$ instead of transmitting them
unchanged to the fusion center and having it utilize the threshold statistic $U$ in (4-19),
we calculated the ratio of the total energy-to-noise ratio $S_T = \tfrac{1}{2} M d^2$ required with
quantization to the total energy-to-noise ratio $S_U$ required when the receiver bases
its decisions on the threshold statistic $U$, both systems attaining the same reliability
$(Q_0, Q_d)$. For the latter, $Q_0$ is given by (4-30) and $Q_d$ by (4-26), with $S_U = \tfrac{1}{2} D^2$. The
loss in input energy-to-noise ratio, defined as $S_T/S_U$, is plotted in decibels versus
the number $M$ of data as the solid curves in Fig. 4-14; two values 0.1 and 0.001
of the false-dismissal probability $Q_1 = 1 - Q_d$ were assumed, and $Q_0 = 10^{-6}$. For
$M = 1$, of course, $S_T/S_U = 1$. The ripples in the curves result from integral jumps
in the optimum decision level $m_0$ as $M$ increases.

Figure 4-14. Loss (dB) in input energy-to-noise ratio when quantization is used
instead of the statistic $U$ in (4-19) versus the number $M$ of samples. Solid curves:
fixed signal amplitude; dashed curves: Rayleigh-fading signal. Curves are indexed
with the false-dismissal probability $Q_1 = 1 - Q_d$; $Q_0 = 10^{-6}$.

When the signal suffers Rayleigh fading as in Sec. 4.2.3, the statistic $g = U$ in
(4-19) is the optimum detection statistic, and its false-alarm and detection probabilities
are given by (4-30) and (4-31); in the latter, $s^2$ is the average energy-to-noise
ratio $S'_U$ per pulse. The probabilities $q_0(g_0)$ and $q_1(g_0)$ are given by (4-84). Again
we calculate the values of the energy-to-noise ratios $S$ and $S'_U$ required to attain the
same reliability $(Q_0, Q_d)$, and we define the loss as the ratio $S/S'_U$. It is plotted as
the dashed curves in Fig. 4-14. Quantization introduces a somewhat greater loss
with Rayleigh-fading signals than with nonfading signals.

It is natural that the optimum strategy will be a pure one ($f = 0$ or $f = 1$),
for the fundamental data are the $M$ $g_k$'s, and they are continuous random variables.
Randomization at the fusion center would introduce a kind of noise into the system,
and we would expect the decision levels $g_0$ to adjust themselves so as to eliminate
it. In order to prove this in detail, we begin by differentiating (4-82) and (4-83) with
respect to $g_0$, using (4-81). All but one of the terms in each summation cancel, and
we are left with

$$\frac{dQ_0}{dg_0} = P_0(m_0, g_0) \left\{ \frac{df}{dg_0} + \frac{dq_0}{dg_0} \left[ \frac{m_0 f}{q_0} + \frac{(M - m_0)(1 - f)}{1 - q_0} \right] \right\},$$

$$\frac{dQ_d}{dg_0} = P_1(m_0, g_0) \left\{ \frac{df}{dg_0} + \frac{dq_1}{dg_0} \left[ \frac{m_0 f}{q_1} + \frac{(M - m_0)(1 - f)}{1 - q_1} \right] \right\}.$$

Because $Q_0$ is fixed, we can set the first of these derivatives equal to zero, solve it
for $df/dg_0$, and substitute that into the second, obtaining after a bit of algebra

$$\frac{dQ_d}{dg_0} = P_1(m_0, g_0) \left\{ m_0 f\, \frac{d}{dg_0} \ln\left( \frac{q_1}{q_0} \right) - (M - m_0)(1 - f)\, \frac{d}{dg_0} \ln\left( \frac{1 - q_1}{1 - q_0} \right) \right\}. \tag{4-86}$$

Both ratios

$$\frac{q_1(g_0)}{q_0(g_0)} \qquad \text{and} \qquad \frac{1 - q_1(g_0)}{1 - q_0(g_0)}$$

appearing in (4-86) are monotonely increasing functions of the quantization level $g_0$.
To see this, consider the operating characteristic of the sensor, which will resemble
that in Fig. 1-2, $q_1$ corresponding to $Q_d$ and $q_0$ to $Q_0$. The ratio $q_1/q_0$ is the slope
of the straight chord from the origin $(0, 0)$ to the point on the curve determined
by the value of $g_0$. As $g_0$ goes from $-\infty$ to $\infty$, or as in our examples, from 0 to
$\infty$, that point moves from $(1, 1)$ to $(0, 0)$, and the slope of the chord increases
monotonely. Likewise the ratio $(1 - q_1)/(1 - q_0)$ is the slope of a chord from the
point on the curve determined by $g_0$ to the point $(1, 1)$, and this slope also increases
monotonely as the point $g_0$ moves downward along the operating characteristic.
Both the logarithmic derivatives multiplying $m_0 f$ and $(M - m_0)(1 - f)$ in (4-86) are
therefore positive.

The graph of $Q_d(g_0)$ versus $g_0$ looks like that in Fig. 4-13, but upside down. As
$g_0$ approaches a cusp from the left, the value of the probability $f$ is approaching 1,
and the first term on the right of (4-86) shows that the slope $dQ_d/dg_0$ of that branch
of the curve is positive. To the right of a cusp, $f$ is increasing from 0, and the second
term of (4-86) shows that the slope is negative. Each cusp must therefore represent a
maximum of the detection probability $Q_d(g_0)$ as a function of the quantization level
$g_0$, and a pure strategy maximizes the detection probability for a fixed false-alarm
probability.

A distributed-detection system may be set up as just described, with quantization
levels $g_{k0}$ equal to a common value $g_0$, and with a fixed decision level $m_0$ on
the number $m$ of 1's reaching the fusion center ($f = 0$), but the signal strengths at
the several sensors may actually be unequal. The probability

$$q_{1k} = q_{1k}(g_0) = \Pr(g_k > g_0 \mid H_1)$$

that the $k$th sensor transmits a 1 to the fusion center will then differ from one sensor
to another. Let $u_k$ represent the quantized output of the $k$th sensor:

$$\Pr(u_k = 0 \mid H_1) = 1 - q_{1k}, \qquad \Pr(u_k = 1 \mid H_1) = q_{1k}.$$

Then the probability of detection is

$$Q_d(m_0) = \sum_{(u)} \prod_{k=1}^{M} \left[ (1 - u_k)(1 - q_{1k}) + u_k q_{1k} \right], \tag{4-87}$$

in which the summation is over the set $(u)$ consisting of all sequences $(u_1, u_2, \ldots, u_M)$
having more than $m_0$ 1's. Algorithms for computing this detection probability as a
function of $m_0$ have been outlined by Sarwate [Sar91].
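
One simple way of evaluating (4-87) without enumerating all $2^M$ sequences is to build up the distribution of the number of 1's one sensor at a time. The sketch below uses that standard recursion; it is offered only as an illustration and is not claimed to be the algorithm of [Sar91]. A brute-force evaluation of (4-87) is included for comparison, and the probabilities in `q1` are hypothetical.

```python
from itertools import product

def detection_probability(q1, m0):
    """Probability of more than m0 1's when sensor k independently sends
    a 1 with probability q1[k]; uses about M**2 / 2 multiplications."""
    dist = [1.0]                      # dist[m] = Pr(m 1's among sensors so far)
    for q in q1:
        new = [0.0] * (len(dist) + 1)
        for m, p in enumerate(dist):
            new[m] += p * (1.0 - q)   # this sensor sends a 0
            new[m + 1] += p * q       # this sensor sends a 1
        dist = new
    return sum(dist[m0 + 1:])

def detection_probability_bruteforce(q1, m0):
    """Direct evaluation of (4-87): sum over all sequences u with more
    than m0 1's of the product of the per-sensor probabilities."""
    total = 0.0
    for u in product((0, 1), repeat=len(q1)):
        if sum(u) > m0:
            term = 1.0
            for uk, q in zip(u, q1):
                term *= (1 - uk) * (1.0 - q) + uk * q
            total += term
    return total

q1 = [0.9, 0.7, 0.5, 0.3, 0.8]        # illustrative, hypothetical values
for m0 in range(len(q1)):
    print(m0, detection_probability(q1, m0))
```

The recursion returns the whole distribution of the number of 1's as a by-product, so the detection probability for every decision level $m_0$ is available from a single pass.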

A greater probability of detection can be attained if instead of using only a
single quantization level $g_0$ and sending binary digits, each sensor incorporates a
number $L - 1$ of quantization levels $a_1, a_2, \ldots, a_{L-1}$, transmitting a digit $r$ if the
datum $g = f[v(t)]$ lies in the $r$th interval $a_{r-1} < g \le a_r$, $1 \le r \le L$, $a_0 = -\infty$, and
$a_L = \infty$. The receiver bases its decision on a likelihood ratio formed by dividing
the probability under hypothesis $H_1$ of receiving the observed sequence of digits by
the probability of receiving it under hypothesis $H_0$. Determining the optimum set
of quantization levels $a_r$ is now a rather more difficult problem than for $L = 2$.
Quantization levels maximizing the efficacy of the overall receiver were worked out
by Kassam [Kas77] and Cimini and Kassam [Cim83]. An approach based on an
approximation method to be introduced in Sec. 5.3.2 is described in [Hel88b].

4.5.2 Nonidentical Independent Sensors

We assume that the $k$th of $M$ sensors processes its input $v_k(t)$ to produce a datum
$g_k = f_k[v_k(t)]$ in a way that may be optimum for detecting the signal expected to
appear at that sensor, or it may be only an approximation to the optimum processing,
as when a threshold statistic is generated. As before, the $k$th sensor transmits a digit
$u_k = 0$ to the fusion center if $g_k$ is less than some quantization level $g_{k0}$ and a
digit $u_k = 1$ if it is greater. Again we assume that the inputs to the several sensors
are statistically independent, but their probability density functions may now differ
from one to another. In our effort to find the optimum quantization levels in the
$M$ sensors, we begin with the Bayes criterion.

As in Sec. 1.2.1, we define prior probabilities $\zeta_0$ and $\zeta_1$ of hypotheses $H_0$ and
$H_1$, respectively, and costs $C_{ij}$ attending the final decision for hypothesis $H_i$ when $H_j$
is true; $\zeta_0 + \zeta_1 = 1$. The average cost of operating the distributed-detection system
is

$$\bar{C} = \zeta_0 [C_{00} \Pr(\to H_0 \mid H_0) + C_{10} \Pr(\to H_1 \mid H_0)] + \zeta_1 [C_{01} \Pr(\to H_0 \mid H_1) + C_{11} \Pr(\to H_1 \mid H_1)]$$

as in (1-14), where $\Pr(\to H_i \mid H_j)$ is the probability that the fusion center decides for
hypothesis $H_i$ when $H_j$ is true. Because

$$\Pr(\to H_0 \mid H_j) = 1 - \Pr(\to H_1 \mid H_j),$$

this average cost can be written as

$$\begin{aligned}
\bar{C} &= \zeta_0 C_{00} + \zeta_1 C_{01} + \zeta_0 (C_{10} - C_{00}) \Pr(\to H_1 \mid H_0) + \zeta_1 (C_{11} - C_{01}) \Pr(\to H_1 \mid H_1) \\
&= \zeta_0 C_{00} + \zeta_1 C_{01} - \zeta_1 (C_{01} - C_{11}) [\Pr(\to H_1 \mid H_1) - \Lambda_0 \Pr(\to H_1 \mid H_0)],
\end{aligned}$$

where as in (1-17)

$$\Lambda_0 = \frac{\zeta_0 (C_{10} - C_{00})}{\zeta_1 (C_{01} - C_{11})}. \tag{4-88}$$

Because $C_{01} > C_{11}$ and $C_{10} > C_{00}$, $\Lambda_0 > 0$; and minimizing the average cost is equivalent
to maximizing the function

$$F = \Pr(\to H_1 \mid H_1) - \Lambda_0 \Pr(\to H_1 \mid H_0). \tag{4-89}$$

We strive to do so by appropriately designing the strategy by which the fusion center
processes its input $\mathbf{u} = (u_1, u_2, \ldots, u_M)$ and by optimally setting the quantization
levels $g_{k0}$ at each sensor.

From what we learned in Sec. 1.2.5 we can state immediately that the optimum
strategy for the fusion center is to form from its input the likelihood ratio

$$\Lambda(\mathbf{u}) = \frac{\Pr(\mathbf{u} \mid H_1)}{\Pr(\mathbf{u} \mid H_0)},$$

where $\Pr(\mathbf{u} \mid H_i) = \Pr(u_1, u_2, \ldots, u_M \mid H_i)$ is the probability under hypothesis $H_i$ of
its receiving from the sensors the sequence $\mathbf{u}$ of 0's and 1's, $i = 0, 1$. The fusion
center decides for hypothesis $H_1$ whenever $\Lambda(\mathbf{u}) > \Lambda_0$; otherwise it chooses hypothesis
$H_0$ that no signal is present. Because the inputs to the sensors are statistically
independent, this likelihood ratio and the decision rule can be written as

$$\Lambda(\mathbf{u}) = \prod_{k=1}^{M} \frac{\Pr(u_k \mid H_1)}{\Pr(u_k \mid H_0)} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \Lambda_0. \tag{4-90}$$

We determine the optimum strategy for each sensor under the assumption that
the fusion center and the rest of the sensors adopt their optimum strategies. The
result will be a necessary condition for the overall system to be optimum, but the
average cost so attained may be only a local and not a global minimum [Rad62].
Under hypothesis $H_i$, $i = 0, 1$, the probability density function of the datum $g_k$ in
the $k$th sensor is $p_{ik}(g_k)$, and the probability that it sends a 1 to the fusion center is

$$q_{ik}(g_{k0}) = \int_{g_{k0}}^{\infty} p_{ik}(g_k)\,dg_k, \qquad i = 0, 1. \tag{4-91}$$

Let $(\mu)$ again denote the set of sequences $\mathbf{u}$ for which the fusion center decides for
hypothesis $H_1$. Its probability of doing so under hypothesis $H_i$ can be written much
as in (4-87):

$$Q_i(\mathbf{g}_0) = \Pr(\to H_1 \mid H_i) = \sum_{(\mu)} \prod_{k=1}^{M} \{(1 - u_k)[1 - q_{ik}(g_{k0})] + u_k\,q_{ik}(g_{k0})\}, \tag{4-92}$$

where $\mathbf{g}_0$ denotes the set of quantization levels $\{g_{k0}\}$, $1 \le k \le M$.

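Fixing the decision rule at the fusion center as in (4-90), we maximize

$$F = Q_1(\mathbf{g}_0) - \Lambda_0\,Q_0(\mathbf{g}_0)$$

by varying each quantization level $g_{k0}$, setting the $M$ partial derivatives of $F$ with
respect to the quantization levels equal to zero:

$$\frac{\partial F}{\partial g_{k0}} = \frac{\partial Q_1}{\partial g_{k0}} - \Lambda_0 \frac{\partial Q_0}{\partial g_{k0}} = 0, \qquad 1 \le k \le M. \tag{4-93}$$

By (4-91) and (4-92)

$$\frac{\partial Q_i}{\partial g_{k0}} = \sum_{(\mu)} (1 - 2u_k)\,p_{ik}(g_{k0}) \prod_{m \ne k} \{(1 - u_m)[1 - q_{im}(g_{m0})] + u_m\,q_{im}(g_{m0})\}$$
$$\phantom{\frac{\partial Q_i}{\partial g_{k0}}} = \sum_{(\mu)} (1 - 2u_k)\,p_{ik}(g_{k0}) \Pr(\mathbf{u}^k \mid H_i), \qquad i = 0, 1, \tag{4-94}$$

where $\mathbf{u}^k$ denotes the set $(u_1, \ldots, u_{k-1}, u_{k+1}, \ldots, u_M)$ of all the 0's and 1's received
by the fusion center in a given trial, omitting the one from the $k$th sensor. The
probabilities figuring in (4-94) are

$$\Pr(\mathbf{u}^k \mid H_i) = \prod_{m \ne k} \Pr(u_m \mid H_i), \qquad i = 0, 1,$$

and

$$\Pr(u_m \mid H_i) = (1 - u_m) \Pr(g_m < g_{m0} \mid H_i) + u_m \Pr(g_m > g_{m0} \mid H_i)$$
$$\phantom{\Pr(u_m \mid H_i)} = (1 - u_m)[1 - q_{im}(g_{m0})] + u_m\,q_{im}(g_{m0}), \qquad i = 0, 1.$$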

We furthermore define the two sequences

$$\mathbf{u}^k_j = (u_1, \ldots, u_{k-1}, u_k = j, u_{k+1}, \ldots, u_M), \qquad j = 0, 1.$$

In (4-94), $1 - 2u_k = 1$ for $u_k = 0$, and $1 - 2u_k = -1$ for $u_k = 1$. If both
sequences $\mathbf{u}^k_0$ and $\mathbf{u}^k_1$ lie in the set $(\mu)$ of sequences causing the fusion center to
decide for hypothesis $H_1$, therefore, the terms corresponding to these sequences will
cancel from the sum over $(\mu)$ in (4-94). The only terms remaining are for sequences
$\mathbf{u}$ in which $\mathbf{u}^k_1$ lies in $(\mu)$ and $\mathbf{u}^k_0$ does not. We designate that set of sequences by
$(\mu_k)$. These are the sequences $\mathbf{u}$ in which the $k$th digit $u_k$ is decisive in the sense
that if $u_k = 1$ the fusion center decides for hypothesis $H_1$, and if $u_k = 0$ it decides
for hypothesis $H_0$. The quantization level $g_{k0}$ in the $k$th sensor does not depend on
the probabilities of sequences $\mathbf{u}$ in which the fusion center disregards digit $u_k$.

In this way (4-93), with (4-94), becomes

$$p_{1k}(g_{k0}) \sum_{(\mu_k)} \Pr(\mathbf{u}^k \mid H_1) - \Lambda_0\,p_{0k}(g_{k0}) \sum_{(\mu_k)} \Pr(\mathbf{u}^k \mid H_0) = 0,$$

and the quantization level $g_{k0}$ in the $k$th sensor is determined by the equation

$$\Lambda_k(g_{k0}) = \Lambda_0\,\frac{\sum_{(\mu_k)} \Pr(\mathbf{u}^k \mid H_0)}{\sum_{(\mu_k)} \Pr(\mathbf{u}^k \mid H_1)}, \qquad 1 \le k \le M, \tag{4-95}$$

where

$$\Lambda_k(g_k) = \frac{p_{1k}(g_k)}{p_{0k}(g_k)}$$

is the likelihood ratio for the $k$th sensor. The $M$ equations (4-95), along with (4-90),
determine the quantization levels $g_{k0}$ in each sensor. Alternative derivations of these
equations are to be found in [Sri86] and [Hob89].
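The condition (4-95) can be checked numerically for a given fusion rule. The following is a minimal sketch, not the author's program: it assumes the exponential densities of the Rayleigh-fading example, $p_{0k}(g) = e^{-g}$ and $p_{1k}(g) = a_k e^{-a_k g}$ with $a_k = 1/(1 + S_k)$ (the form of $a_k$ is an assumption of this sketch), and it takes the first solution of Table 4-1, in which only the all-zero sequence leads to $H_0$. The function names `decisive_ratio` and `implied_lambda0` are hypothetical. If (4-95) holds at every sensor, the value of $\Lambda_0$ implied by each of the $M$ equations should be the same.

```python
import math
from itertools import product

def decisive_ratio(k, g0, a, h0_set):
    """Sum over (mu_k) of Pr(u^k|H0) divided by the same sum of Pr(u^k|H1).

    This is the right side of (4-95) divided by Lambda_0. The exponential
    model q_0m = exp(-g), q_1m = exp(-a_m g) is an assumption of this sketch.
    """
    M = len(g0)
    q0 = [math.exp(-g) for g in g0]
    q1 = [math.exp(-am * g) for am, g in zip(a, g0)]
    others = [m for m in range(M) if m != k]
    num = den = 0.0
    for rest in product((0, 1), repeat=M - 1):
        u_with_0 = rest[:k] + (0,) + rest[k:]
        u_with_1 = rest[:k] + (1,) + rest[k:]
        # Digit k is decisive when u_k = 1 gives H1 and u_k = 0 gives H0.
        if u_with_1 in h0_set or u_with_0 not in h0_set:
            continue
        p0 = p1 = 1.0
        for m, um in zip(others, rest):
            p0 *= q0[m] if um else 1.0 - q0[m]
            p1 *= q1[m] if um else 1.0 - q1[m]
        num += p0
        den += p1
    return num / den  # raises ZeroDivisionError if digit k is never decisive

def implied_lambda0(k, g0, a, h0_set):
    """Lambda_0 implied by the k-th equation (4-95)."""
    lam_k = a[k] * math.exp((1.0 - a[k]) * g0[k])  # p_1k / p_0k at g_k0
    return lam_k / decisive_ratio(k, g0, a, h0_set)

# First solution of Table 4-1 (energy-to-noise ratios and quantization levels).
S = [40, 38, 36, 34, 14, 12, 10]
g0 = [8.7296, 8.7360, 8.7433, 8.7513, 8.9608, 9.0209, 9.1056]
a = [1.0 / (1.0 + s) for s in S]
h0_set = {(0,) * 7}  # only the all-zero sequence leads to H0
lams = [implied_lambda0(k, g0, a, h0_set) for k in range(len(S))]
print(lams)
```

Enumerating all $2^{M-1}$ subsequences keeps the sketch close to the text; it is practical only for a small number of sensors.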

In order to interpret these equations, let us describe an iterative procedure for
calculating the quantization levels leading to minimum Bayes cost for a given set of
costs and prior probabilities, that is, for a given ratio $\Lambda_0$ (4-88). Given a trial set
$\mathbf{g}_0 = \{g_{k0}\}$ of quantization levels in the $M$ sensors, one's computer runs through all
$2^M$ possible sequences $\mathbf{u} = (u_1, u_2, \ldots, u_M)$ of 0's and 1's and assigns them to $H_0$ or
$H_1$ according to whether the likelihood ratio $\Lambda(\mathbf{u})$ in (4-90) is less than or greater
than the ratio $\Lambda_0$ in (4-88). It then uses (4-95) to evaluate a new quantization level
$g_{k0}$ for each sensor.

Denote the right side of (4-95) by $R_k(\mathbf{g}_0)$. A stable way of determining the new
set of quantization levels, we have found, is to solve the $M$ equations

$$\Lambda_k(g'_{k0}) = R_k(\mathbf{g}_0), \qquad k = 1, 2, \ldots, M,$$

for the $M$ quantities $g'_{k0}$, and then to take the $k$th new quantization level as the
average of $g'_{k0}$ and the previous level $g_{k0}$,

$$g_{k0} \leftarrow \tfrac{1}{2}(g_{k0} + g'_{k0}).$$

With the new set of quantization levels, the computer reclassifies all $2^M$
sequences in accordance with (4-90), recalculates the quantization levels by (4-95), and
continues thus until the latter cease changing significantly. It can, for instance,
calculate the quantity $F$ in (4-89) at each stage and stop when its increase becomes
negligible. That quantity is

$$F = Q_d(\Lambda_0) - \Lambda_0\,Q_0(\Lambda_0),$$

and the false-alarm and detection probabilities involved are $Q_0(\Lambda_0) = \Pr(\to H_1 \mid H_0)$
and $Q_d(\Lambda_0) = \Pr(\to H_1 \mid H_1)$, where as in (4-92)

$$\Pr(\to H_1 \mid H_i) = \sum_{(\mu)} \prod_{k=1}^{M} \Pr(u_k \mid H_i), \qquad i = 0, 1, \tag{4-96}$$

with the summation over the set $(\mu)$ of all sequences $\mathbf{u}$ leading the fusion center to
decide for hypothesis $H_1$.
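The sum (4-96) is easy to evaluate by brute force when $M$ is small. The sketch below assumes the exponential (Rayleigh-fading) model $q_{0k}(g) = e^{-g}$, $q_{1k}(g) = e^{-a_k g}$ with $a_k = 1/(1 + S_k)$, which is an assumption made for this illustration, and evaluates the false-alarm and detection probabilities for the first two solutions listed in Table 4-1. The function name `fusion_probabilities` is hypothetical.

```python
import math
from itertools import product

def fusion_probabilities(g0, a, h0_set):
    """Return (Q0, Qd) from (4-96) by enumerating all 2^M sequences.

    h0_set holds the sequences for which the fusion center decides for H0;
    every other sequence belongs to the set (mu) and leads to H1.
    """
    M = len(g0)
    q0 = [math.exp(-g) for g in g0]                   # Pr(u_k = 1 | H0)
    q1 = [math.exp(-am * g) for am, g in zip(a, g0)]  # Pr(u_k = 1 | H1)
    Q0 = Qd = 0.0
    for u in product((0, 1), repeat=M):
        if u in h0_set:
            continue
        p0 = p1 = 1.0
        for m, um in enumerate(u):
            p0 *= q0[m] if um else 1.0 - q0[m]
            p1 *= q1[m] if um else 1.0 - q1[m]
        Q0 += p0
        Qd += p1
    return Q0, Qd

S = [40, 38, 36, 34, 14, 12, 10]
a = [1.0 / (1.0 + s) for s in S]
zero = (0,) * 7
single_ones = {tuple(1 if i == k else 0 for i in range(7)) for k in range(7)}

# First solution of Table 4-1: only 0000000 leads to H0.
g_first = [8.7296, 8.7360, 8.7433, 8.7513, 8.9608, 9.0209, 9.1056]
Q0_first, Qd_first = fusion_probabilities(g_first, a, {zero})

# Second solution: 0000000 and every sequence with a single 1 lead to H0.
g_second = [5.0098, 4.9934, 4.9776, 4.9624, 4.9075, 4.9312, 4.9726]
Q0_second, Qd_second = fusion_probabilities(g_second, a, {zero} | single_ones)

print(Q0_first, 1.0 - Qd_first)
print(Q0_second, 1.0 - Qd_second)
```

The work grows as $2^M$, so a direct enumeration of this kind is suitable only for a handful of sensors.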

If one has adopted the Neyman-Pearson criterion that the probability $Q_d$
of detection shall be maximum for a preassigned false-alarm probability $Q_0$, one
can search for the value of the parameter $\Lambda_0$ for which the false-alarm probability
$Q_0(\Lambda_0) = \Pr(\to H_1 \mid H_0)$ takes on that preassigned value $Q_0$. For each value of $\Lambda_0$
during the search, the computer must work out the optimum set of quantization
levels $g_{k0}$. We now consider how to initiate this search.

Let us suppose that the inputs to all the sensors have the same statistical
character, but differ only in their input energy-to-noise ratios $S_1, S_2, \ldots, S_M$. In
order to determine an initial set $\mathbf{g}_0$ of quantization levels and an initial value of
$\Lambda_0$, one pretends that all these input ratios are equal to a common value $S$ and
that all the data $g_k$ are identically distributed. Then all the quantization levels will
initially be equal to a value $g_0$ that can be calculated by the method described in
Sec. 4.5.1. The fusion center will in that situation decide for hypothesis $H_1$ if it
receives more than some number $m_0$ of 1's and for $H_0$ if it receives $m_0$ or fewer. The
subsequences $\mathbf{u}^k$ in (4-94) for the sequences $\mathbf{u}$ lying in the sets $(\mu_k)$ have exactly $m_0$
1's and $M - 1 - m_0$ 0's. In the notation of Sec. 4.5.1, (4-95) yields for the initial
value of $\Lambda_0$

$$\Lambda_0 = \frac{p_1(g_0)\,[q_1(g_0)]^{m_0}\,[1 - q_1(g_0)]^{M-1-m_0}}{p_0(g_0)\,[q_0(g_0)]^{m_0}\,[1 - q_0(g_0)]^{M-1-m_0}},$$

with $q_i(\cdot)$ given in (4-80); $p_i(g)$ is the probability density function of the datum $g$
in each sensor, which is temporarily assumed to be the same in all ($i = 0, 1$). The
subsequent search seems to converge most rapidly if the common input
energy-to-noise ratio $S$ lies roughly in the middle of the set $S_1, S_2, \ldots, S_M$.

With these initial values of $\Lambda_0$ and $g_{k0} = g_0$, $1 \le k \le M$, one restores the
original distributions of the $M$ data $g_k$ and uses the procedure based on (4-90) and
(4-95) to calculate a new set of quantization levels $g_{k0}$, and one evaluates its
false-alarm probability $Q_0(\Lambda_0)$. Next one increases $\Lambda_0$ by some small amount $\varepsilon$ and by
the same method computes $Q_0(\Lambda_0 + \varepsilon)$. The secant method then determines a new
value of $\Lambda_0$ through

$$\Lambda_0 \leftarrow \Lambda_0 + \varepsilon\,\frac{Q_0(\Lambda_0) - Q_0}{Q_0(\Lambda_0) - Q_0(\Lambda_0 + \varepsilon)} \tag{4-97}$$

[Pre86, pp. 248-51], where $Q_0$ without an argument is the preassigned false-alarm
probability. This process is repeated until the values of $\Lambda_0$ cease changing
significantly. The probability of detection is the final value of $\Pr(\to H_1 \mid H_1)$ as in
(4-96).

The writer has tried this method for signals suffering Rayleigh fading, for which
the datum $g_k$ has density functions and complementary cumulative distributions
given by

$$p_{0k}(g) = e^{-g}\,U(g), \qquad q_{0k}(g_0) = e^{-g_0}\,U(g_0),$$
$$p_{1k}(g) = a_k e^{-a_k g}\,U(g), \qquad q_{1k}(g_0) = e^{-a_k g_0}\,U(g_0), \qquad a_k = \frac{1}{1 + S_k}. \tag{4-98}$$

As many as nine sensors were included, and a small variety of input energy-to-noise
ratios were tested. It was found more efficient to work with the variables $\ln Q_0$
and $\ln \Lambda_0$ in the secant method (4-97) than with $Q_0$ and $\Lambda_0$. The method converged in
all cases, but no guarantee can be furnished that it will always do so. If it goes awry,
some other method of solving the equations (4-95) must be sought. The number $2^M$
of sequences that must be taken into account, and thus also the required storage
and the computation time, increase exponentially with the number $M$ of sensors.
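The search over $\Lambda_0$ in logarithmic variables can be sketched as follows. This is only an illustration under stated assumptions: `q0_of_lambda` stands for whatever routine returns the false-alarm probability of the optimized system for a given $\Lambda_0$, and the stand-in `toy_q0` used here is a hypothetical smooth function chosen so that the answer is known; it is not a distributed-detection model. The update perturbs $\ln \Lambda_0$ by a small amount `eps` at each step, in the manner of (4-97).

```python
import math

def secant_search(q0_of_lambda, q0_target, lam_start, eps=1e-3,
                  tol=1e-10, max_iter=50):
    """Find Lambda_0 with q0_of_lambda(Lambda_0) = q0_target.

    Works with x = ln(Lambda_0) and f(x) = ln Q0 - ln(q0_target), using the
    slope estimated from the pair of points x and x + eps.
    """
    def f(x):
        return math.log(q0_of_lambda(math.exp(x))) - math.log(q0_target)

    x = math.log(lam_start)
    for _ in range(max_iter):
        fx = f(x)
        denom = fx - f(x + eps)
        if denom == 0.0:
            break  # flat function: no usable slope estimate
        x_new = x + eps * fx / denom
        if abs(x_new - x) < tol:
            return math.exp(x_new)
        x = x_new
    return math.exp(x)

def toy_q0(lam):
    """Hypothetical stand-in for the false-alarm probability as a function of Lambda_0."""
    return math.exp(-lam)

lam_solution = secant_search(toy_q0, q0_target=1.0e-3, lam_start=5.0)
print(lam_solution)  # for the toy function the root is ln(1000)
```

In an actual calculation each call of `q0_of_lambda` would rerun the iteration on the quantization levels, so the number of function evaluations per step (two here) matters far more than the arithmetic of the update itself.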

This method produces only a local maximum of the detection probability $Q_d$
for a given false-alarm probability $Q_0$. To what set of quantization levels it converges
may depend on the initial values chosen for them. In Table 4-1 we list three sets that
satisfy the optimization equations (4-95) for $Q_0 = 0.001$, along with the resulting
false-dismissal probabilities $Q_1 = 1 - Q_d$. Under each set is a list of the sequences
of digits that cause the fusion center to decide for hypothesis $H_0$. The digits are
in the same order as the sensor input energy-to-noise ratios in the top half of the
table. The first solution resulted from taking all input energy-to-noise ratios equal
initially to $S_k = 34$, the second from taking them equal to 20, and the third from
taking them equal to the largest energy-to-noise ratio $S_1 = 40$. The solution in the
first column yields the highest probability $Q_d$ of detection. In this example three of
the seven sensors have much weaker inputs than the rest.

Equations corresponding to (4-95) when the inputs to the sensors are not
statistically independent are to be found in [Hob89], but no procedure for solving them
has been recommended. The weak-signal approximation has been utilized in [Blu92]
to obtain a strategy for distributed-detection systems with a small number of
dependent inputs.

Table 4-1 Distributed Detection (Q_0 = 0.001)

  S_k     Quantization levels g_k0
          First         Second        Third
  40      8.7296        5.0098        8.6046
  38      8.7360        4.9934        8.6110
  36      8.7433        4.9776        8.6181
  34      8.7513        4.9624        8.6261
  14      8.9608        4.9075        4.8105
  12      9.0209        4.9312        4.6621
  10      9.1056        4.9726        4.4784

  Q_1     2.2704 (-4)   2.6544 (-4)   3.7437 (-4)

  Sequences leading to H_0
          0000000       0000000       0000000
                        1000000       0000100
                        0100000       0000010
                        0010000       0000001
                        0001000
                        0000100
                        0000010
                        0000001

Problems 

4-1. A binary communication system transmits 0's and 1's by sending a pulse at carrier
frequency $\Omega_1$ for each 0 and a pulse at carrier frequency $\Omega_2$ for each 1. The receiver
has two narrowband inputs

$$v_1(t) = \mathrm{Re}\,V_1(t) \exp i\Omega_1 t \quad\text{and}\quad v_2(t) = \mathrm{Re}\,V_2(t) \exp i\Omega_2 t,$$

both observed during an interval $(0, T)$ and corrupted by statistically independent white
noise processes

$$n_1(t) = \mathrm{Re}\,N_1(t) \exp i\Omega_1 t \quad\text{and}\quad n_2(t) = \mathrm{Re}\,N_2(t) \exp i\Omega_2 t$$

with unilateral spectral density $N$.
Under hypothesis $H_0$

$$v_1(t) = n_1(t) + A\,\mathrm{Re}\,F(t) \exp(i\Omega_1 t + i\psi), \qquad v_2(t) = n_2(t);$$

under hypothesis $H_1$

$$v_1(t) = n_1(t), \qquad v_2(t) = n_2(t) + A\,\mathrm{Re}\,F(t) \exp(i\Omega_2 t + i\psi).$$

The complex envelope $F(t)$ is known to the receiver, but the phase $\psi$ is unknown and
uniformly distributed over $(0, 2\pi)$. The signals are subject to Rayleigh fading; that is,
the signal amplitude is a random variable with a Rayleigh distribution

$$z(A) = \frac{A}{\sigma^2} \exp\!\left(-\frac{A^2}{2\sigma^2}\right) U(A).$$

How should the receiver process its two inputs in order to decide between
hypotheses $H_0$ and $H_1$ with minimum probability $P_e$ of error? Assume that 0's and 1's
occur with equal relative frequencies. In terms of a suitably defined average
energy-to-noise ratio, calculate that minimum probability of error.

4-2. Use (3-106) and the method of Problem 3-11 to derive (4-18). 

4-3. Define an effective signal-to-noise ratio for the statistic $U$ by

$$D_U^2 = \frac{[E(U \mid H_1) - E(U \mid H_0)]^2}{\mathrm{Var}_0\,U}$$

as in (4-72), where $\mathrm{Var}_0\,U$ is the variance of $U$ under hypothesis $H_0$. Determine the
effective signal-to-noise ratios $D_U^2$ and $D_{U'}^2$ for the statistics $U$ and $U'$ of (4-19) and
(4-21), respectively, when the signal-to-noise ratio of the radar echo signal actually
present in the $k$th interval is

$$d_k^2 = d^2 f(\theta_k - \theta_0),$$

and $f(\cdot)$ is the combination of the antenna gain patterns on transmission and reception.
Use Schwarz's inequality to show that $D_{U'} \ge D_U$.

4-4. Evaluate the effective signal-to-noise ratio $D_{g'}^2$ defined as in Problem 4-3 for the statistic
$g'$ of (4-5) and determine the coefficients $a_m$ for which it is maximum.
4-5. As in Example 1-3, a receiver must decide whether a noise source is present on the
basis of $n$ independent measurements of a datum $v$ that has a Gaussian distribution
with expected value 0. In the absence of the source (hypothesis $H_0$) its variance is
$N_0$, and in its presence (hypothesis $H_1$) the variance of $v$ is $N_1 = N_0 + S$, $S > 0$. The
receiver bases its decisions on the statistic

J'i 

The optimum statistic is $G_1$. Calculate the asymptotic relative efficiency of a receiver
utilizing $G_k$ versus a receiver utilizing $G_1$ for integral values of $k > 1$. Evaluate it for
$k = 2, 3, 4$.

4-6. Under hypothesis $H_0$ the datum $g$ has the probability density function
$p_0(g) = \tfrac{1}{2} \exp(-|g|)$; under $H_1$ it is $p_1(g) = \tfrac{1}{2} \exp(-|g - a|)$, where $a > 0$
represents a signal. On the basis of $M$ independent data $g_1, g_2, \ldots, g_M$ of this kind,
a receiver is to decide between these two hypotheses in accordance with the
Neyman-Pearson criterion. Find the optimum statistic for this decision as a combination
of the data, and calculate its effective signal-to-noise ratio as defined in (4-66).
Compare this receiver with one basing its decisions on the statistic

M 

Calculate the asymptotic relative efficiency of this receiver with respect to the optimum
receiver. Be careful in passing to the limit $a \to 0$.
4-7. Determine the efficacy, as defined in (4-69), for a detector that utilizes the statistic 



8 



<<") = 



m > 0, 



where $v(t) = \mathrm{Re}\,V(t) \exp i\Omega t$ is the input to the receiver and
$s(t) = \mathrm{Re}\,S(t) \exp(i\Omega t + i\psi)$ is the signal to be detected. The noise is white and
Gaussian. For the necessary moments use (C-7) of Appendix C. Find the asymptotic
relative efficiency a.r.e.$_{2:m}$ of a receiver utilizing a quadratic detector ($m = 2$) with
respect to a receiver using an $m$th-law detector ($m \ne 2$). Calculate it for $m = 1, 3, 4$,
and 5 and observe that it is greater than 1.

4-8. In a certain diversity communication system there are $M$ receivers that simultaneously
pick up the output of a single transmitter sending messages coded into equally probable
0's and 1's. For a 0 the transmitter sends nothing; for a 1 it sends a narrowband pulse
with complex envelope $F(t)$. The digits appear every $T$ seconds, and the signals are
confined to intervals of duration $T$. Each receiver picks up a common "specular"
component of the signal, which has a known amplitude and is the same in all the
receivers, and an independently randomly scattered component, whose amplitude has
a Rayleigh distribution with parameter $\sigma^2$ as in Problem 4-1. Thus the input to the $k$th
receiver when a 1 was sent is

$$v_k(t) = n_k(t) + B\,\mathrm{Re}\,F(t) \exp(i\Omega t + i\psi) + C_k\,\mathrm{Re}\,F(t) \exp(i\Omega t + i\phi_k),$$
$$k = 1, 2, \ldots, M, \qquad 0 < t < T,$$

and the probability density function of $C_k$ has the Rayleigh form in Problem 4-1.
The phases $\phi_k$, $1 \le k \le M$, and $\psi$ are all independent and uniformly distributed over
$(0, 2\pi)$. The random amplitudes $C_1, C_2, \ldots, C_M$ are independent of each other and of
the phases. The noise inputs $n_k(t)$ are independent, white, and Gaussian with unilateral
spectral density $N$.

(a) Work out the optimum way of processing and combining the $M$ inputs $v_1(t)$,
$v_2(t), \ldots, v_M(t)$ in order to decide with minimum probability of error whether a 0
or a 1 was sent. Draw a block diagram of your system. Hint: It is simpler to
work with the real and imaginary parts of $C_k \exp i\phi_k$, $k = 1, 2, \ldots, M$. What is
their joint probability density function?

(b) Determine the threshold statistic $G$ for this decision problem under the assumption
that the signal amplitudes are very small. Assume that the ratio $\sigma^2/B^2$ of the
scattered component to the specular component is known and fixed. Calculate
the effective signal-to-noise ratio, defined as in (4-72), for the statistic $G$. Here
$E(G \mid H_1)$ includes an average over the distributions of the amplitudes and the
phases of the received signals. Hint: Use (4-75).

4-9. For all four Swerling cases, derive the probability density functions $z(S)$ of the total
energy-to-noise ratio, as given in Sec. 4.2.4, from those given for the individual
signal-to-noise ratios $d$. Calculate the expected values and the variances of the statistic $U$ in
(4-19) in all four cases.

Evaluating Signal Detectability

5.1 THE EDGEWORTH SERIES

5.1.1 The Moment-generating Function

Radar receivers, as we have seen in Chapter 4, often base their decisions about the
presence or absence of a target on multiple inputs, the noise in which is statistically
independent from one to another. Most commonly the detection statistic, which we
now denote by $G'$, is the sum of statistics $g_j$ formed from each input,

$$G' = g_1 + g_2 + \cdots + g_j + \cdots + g_M; \tag{5-1}$$

the statistic $g_j$ is a functional

$$g_j = f_j[v_j(t)]$$

of the $j$th input, and $M$ is the number of inputs. Indeed, if signal parameters such as
carrier phase and amplitude are independently random from one input to another,
the logarithm of the average likelihood ratio has the form (5-1), and in Chapter 4 we
saw that a particular threshold statistic also takes this form. A diversity
communication system in which a symbol 1 invokes the simultaneous transmission of pulses
in disjoint frequency bands likewise decides about the transmitted message on
the basis of a statistic such as $G'$. The components $g_j$ of $G'$ are often independent
random variables because of the independence of the noise in the several inputs.
They may lose that statistical independence, however, if the signal components
depend on a common, but random parameter such as the amplitude $A$ in Swerling
cases 1 and 3 (Sec. 4.2.4).

The larger the number $M$ of inputs, the more tedious it becomes to calculate
false-alarm and detection probabilities

$$Q_0 = \int_G^\infty P_0(g)\,dg, \qquad Q_d = \int_G^\infty P_1(g)\,dg$$

for a receiver that decides that a signal is present when the statistic $G'$ exceeds a
decision level $G$. (We can use any letter we like to designate our variable of
integration, and we shall often use $g$ in this context.) The algorithms in the Appendix,
Sec. C.3, for calculating the $M$th-order $Q$ function, for instance, take a long time
when $M$ is large and may require complicated stratagems in order to avoid overflow
and underflow in the process. For some detectors it may not even be possible to
determine the density functions $P_0(\cdot)$ and $P_1(\cdot)$ of the statistic $G'$ under hypotheses
$H_0$ and $H_1$ in closed form.

Methods for calculating tail probabilities such as $Q_0$ and $Q_d$ will now be
presented that are generally the more accurate, the larger the number $M$ of independent
random variables $g_k$, yet whose computation times are roughly independent of $M$.
Reference to a particular hypothesis will be omitted. We shall assume that the
moment-generating function

$$h(z) = E(e^{-G'z}) = \int_{-\infty}^{\infty} P(g)\,e^{-gz}\,dg \tag{5-2}$$

of the statistic $G'$ is an analytic function in the complex $z$-plane. The
moment-generating function is the Laplace transform of the probability density function
of $G'$. It was given this name because the coefficients of its Taylor expansion
about the origin,

$$h(z) = \sum_{k=0}^{\infty} E(G'^k)\,\frac{(-z)^k}{k!}, \tag{5-3}$$

are proportional to the moments of the random variable $G'$. The moment-generating
function is assumed to be regular in a vertical strip that contains the imaginary
axis and has finite or infinite width; we deal only with random variables all of
whose moments are finite. This strip, which is called the regularity domain, can be
defined by

$$c_1 < \mathrm{Re}\,z < c_2, \qquad c_1 < 0, \quad c_2 > 0.$$

On the imaginary axis, $\mathrm{Re}\,z = 0$, $z = -i\omega$; and $h(-i\omega)$ is the familiar characteristic
function, that is, the Fourier transform of the probability density function $P(\cdot)$
[Hel91, pp. 121-3], [Pap91, pp. 115-20]. When, as often, $G'$ is a positive random
variable, the regularity domain includes the entire right half-plane, for then

$$\left| \int_0^\infty P(g)\,e^{-gz}\,dg \right| \le \int_0^\infty P(g)\,e^{-gx}\,dg \le \int_0^\infty P(g)\,dg = 1, \qquad x = \mathrm{Re}\,z > 0,$$

and singularities occur only in the left half-plane.

When the statistic $G'$ is a sum of $M$ independent random variables, the
moment-generating function of $G'$ is the product

$$h(z) = \prod_{k=1}^{M} \eta_k(z)$$

of the moment-generating functions

$$\eta_k(z) = E[\exp(-g_k z)] = \int_{-\infty}^{\infty} p_k(g)\,e^{-gz}\,dg$$

of the statistics $g_k = f_k[v_k(t)]$; $p_k(\cdot)$ is the probability density function of $g_k$.
5.1.2 The Gram-Charlier and Edgeworth Series

When, as we shall now assume, the random variables $g_j$ in (5-1) are independent
and identically distributed, their sum $G'$ possesses a distribution that by virtue of
the central limit theorem is the more nearly Gaussian, the larger the number $M$
[Hel91, pp. 260-5], [Pap91, pp. 214-21]. We describe a method that takes advantage
of that asymptotic behavior of the probability density function of $G'$. It requires
knowing only the moments of the random variable $G'$, or of its components $g_j$, and
it does not require the characteristic function or the moment-generating function to
be available in closed form.

The probability density function of a random variable such as $G'$ is the inverse
Laplace transform of its moment-generating function,

$$P(g) = \int_{c - i\infty}^{c + i\infty} h(z)\,e^{gz}\,\frac{dz}{2\pi i}. \tag{5-4}$$

The vertical contour of integration can lie anywhere within the regularity domain of
the Laplace transform (5-2): $c_1 < c < c_2$.

Into (5-4) we shall introduce the logarithm of the moment-generating function
$h(z)$ expanded in a series of powers of $(-z)$,

$$\ln h(z) = -\bar{G} z + \tfrac{1}{2}\sigma^2 z^2 + \sum_{k=3}^{\infty} \kappa_k \frac{(-z)^k}{k!}, \tag{5-5}$$

where $\bar{G} = E(G')$ is the expected value of the statistic $G'$ and $\sigma^2 = \mathrm{Var}\,G'$ is its
variance. The quantities $\kappa_k$ are called the cumulants or semiinvariants of $G'$, and
$\ln h(z)$ is called the cumulant-generating function. The reason for the name
"cumulant" is that $\kappa_k$ for a sum $G'$ of independent random variables equals the sum of
the cumulants $\kappa_k^{(j)}$ for the components $g_j$ of the sum. The first cumulant $\kappa_1$ is the
expected value; the second cumulant $\kappa_2$ is the variance. Later we shall see how the
coefficients of the expansion in (5-5) can be calculated from the moments $E(G'^k)$ of
$G'$, $k = 1, 2, \ldots$.

The moment-generating function $h(z)$ of $G'$ is written as the exponential
function of the right side of (5-5):

$$h(z) = \exp(-\bar{G} z + \tfrac{1}{2}\sigma^2 z^2)\,\exp r(z), \qquad r(z) = \sum_{k=3}^{\infty} \kappa_k \frac{(-z)^k}{k!}. \tag{5-6}$$

When we write $\exp r(z)$ as a power series in $(-z)$ by using the Taylor expansion of
the exponential function, and when we then collect terms with like powers of $(-z)$,
we obtain the power series

$$\exp r(z) = 1 + \sum_{k=3}^{\infty} c_k (-z)^k. \tag{5-7}$$

The first few coefficients are given by

$$c_3 = \frac{\kappa_3}{3!}, \qquad c_4 = \frac{\kappa_4}{4!}, \qquad c_5 = \frac{\kappa_5}{5!}, \qquad c_6 = \frac{\kappa_6}{6!} + \frac{\kappa_3^2}{2\,(3!)^2},$$

and so on. An algorithm for computing these from the cumulants will be presented
later.

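After $h(z)$ in this form is substituted into the contour integral in (5-4), we
obtain for the probability density function of $G'$

$$P(g) = \int_{c - i\infty}^{c + i\infty} \exp[(g - \bar{G})z + \tfrac{1}{2}\sigma^2 z^2] \left[ 1 + \sum_{k=3}^{\infty} c_k (-z)^k \right] \frac{dz}{2\pi i}. \tag{5-8}$$

We shall now evaluate this inverse transform term by term.

The moment-generating function of a Gaussian random variable with expected
value $\bar{G}$ and variance $\sigma^2$ is $\exp(-\bar{G} z + \tfrac{1}{2}\sigma^2 z^2)$, and we can therefore write, by (5-4),

$$\frac{1}{\sigma}\,\phi\!\left(\frac{g - \bar{G}}{\sigma}\right) = \int_{c - i\infty}^{c + i\infty} \exp[(g - \bar{G})z + \tfrac{1}{2}\sigma^2 z^2]\,\frac{dz}{2\pi i},$$

where

$$\phi(y) = \frac{1}{\sqrt{2\pi}}\,e^{-y^2/2} \tag{5-9}$$

is called the error function. Differentiating $k$ times with respect to $\bar{G}$, we find

$$\int_{c - i\infty}^{c + i\infty} (-z)^k \exp[(g - \bar{G})z + \tfrac{1}{2}\sigma^2 z^2]\,\frac{dz}{2\pi i} = \frac{1}{\sigma^{k+1}}\,\phi^{(k)}\!\left(\frac{g - \bar{G}}{\sigma}\right),$$

where

$$\phi^{(k)}(y) = (-1)^k \frac{d^k}{dy^k}\,\phi(y). \tag{5-10}$$

Thus we can write (5-8) for the probability density function of $G'$ as

$$P(g) = \frac{1}{\sigma}\,\phi\!\left(\frac{g - \bar{G}}{\sigma}\right) + \sum_{k=3}^{\infty} \frac{c_k}{\sigma^{k+1}}\,\phi^{(k)}\!\left(\frac{g - \bar{G}}{\sigma}\right), \tag{5-11}$$

which is known as the Gram-Charlier series. It represents the density function as a
Gaussian density function plus a sequence of correction terms.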
The functions $\phi^{(k)}(y)$ are often written as

$$\phi^{(k)}(y) = h_k(y)\,\phi(y),$$

where $h_k(y)$ is the $k$th Hermite polynomial,

$$h_0(y) = 1, \qquad h_1(y) = y, \qquad h_2(y) = y^2 - 1,$$

and so on. The functions $\phi^{(k)}(y)$, like the Hermite polynomials, are subject to the
recurrent relation

$$\phi^{(k+1)}(y) = y\,\phi^{(k)}(y) - k\,\phi^{(k-1)}(y), \tag{5-12}$$

which follows easily from Leibnitz's rule for differentiating a product:

$$\frac{d^{k+1}\phi}{dy^{k+1}} = \frac{d^k}{dy^k}\,[-y\,\phi(y)] = -y\,\frac{d^k\phi}{dy^k} - k\,\frac{d^{k-1}\phi}{dy^{k-1}}.$$

In analyzing receivers we most often need not the probability density function
of the decision statistic $G'$, but its complementary cumulative distribution

$$q^+(G) = \Pr(G' > G) = \int_G^\infty P(g)\,dg, \tag{5-13}$$

which furnishes us with the false-alarm or the detection probability, as the case may
be. When we substitute the series (5-11) into (5-13), the first term yields an
error-function integral as defined in (1-11). Integrating the remaining terms reduces the
order of the derivative $\phi^{(k)}$ by 1, as can be seen from (5-10), and we obtain

$$q^+(G) = \mathrm{erfc}\!\left(\frac{G - \bar{G}}{\sigma}\right) + \sum_{k=3}^{\infty} \frac{c_k}{\sigma^k}\,\phi^{(k-1)}\!\left(\frac{G - \bar{G}}{\sigma}\right) \tag{5-14}$$

for the complementary cumulative distribution of the statistic $G'$. The cumulative
distribution is similarly

$$q^-(G) = \int_{-\infty}^{G} P(g)\,dg = 1 - q^+(G)
        = 1 - \mathrm{erfc}\!\left(\frac{G - \bar{G}}{\sigma}\right) - \sum_{k=3}^{\infty} \frac{c_k}{\sigma^k}\,\phi^{(k-1)}\!\left(\frac{G - \bar{G}}{\sigma}\right). \tag{5-15}$$

The functions $1 - \mathrm{erfc}\,y$ and $(-1)^k \phi^{(k)}(y)$ have been tabulated for use in (5-11)
through (5-15) [Har52], [Abr70, pp. 966-74]. The recurrent relation (5-12) enables
us to calculate the functions $\phi^{(k)}(y)$, $k > 1$, one after another, starting with $\phi^{(0)}(y)$
and $\phi^{(1)}(y)$; a calculator can easily be programmed to do so, and the tables are
unnecessary. For computation of $\mathrm{erfc}\,y$ see [Hel91, pp. 592-6] or [Abr70, pp. 932-3].
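The recurrent relation (5-12) translates directly into a short loop. The sketch below is an illustration only; the function name `phi_derivatives` is hypothetical. It returns the signed derivatives $\phi^{(k)}(y) = (-1)^k\,d^k\phi/dy^k$ for $k = 0, 1, \ldots, k_{\max}$, which can be compared with the closed forms $h_k(y)\,\phi(y)$ for the first few Hermite polynomials.

```python
import math

def phi_derivatives(y, kmax):
    """Return [phi^(0)(y), ..., phi^(kmax)(y)] using the recurrence (5-12).

    Here phi^(k)(y) = (-1)^k d^k phi / dy^k with phi the unit Gaussian density,
    so that phi^(k)(y) = h_k(y) * phi(y) with h_k the k-th Hermite polynomial.
    """
    phi0 = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
    vals = [phi0]
    if kmax >= 1:
        vals.append(y * phi0)  # phi^(1)(y) = y * phi(y)
    for k in range(1, kmax):
        # phi^(k+1) = y * phi^(k) - k * phi^(k-1)
        vals.append(y * vals[k] - k * vals[k - 1])
    return vals

y = 1.5
vals = phi_derivatives(y, 4)
phi_y = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
print(vals[2], (y**2 - 1.0) * phi_y)       # h_2(y) = y^2 - 1
print(vals[3], (y**3 - 3.0 * y) * phi_y)   # h_3(y) = y^3 - 3y
```

For large $k$ the values grow rapidly in magnitude, which is the behavior that makes the series awkward to sum far from the mean.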

When the moments of the components $g_j$ of the sum $G'$ in (5-1) can be
calculated, but not their moment-generating function, the coefficients of the series for
$\ln h(z)$ in (5-5) can be determined in the following way. Denote the
moment-generating function of each component $g_j$ by $\eta(z)$. Then

$$\eta(z) = 1 + \sum_{k=1}^{\infty} a_k (-z)^k, \qquad a_k = \frac{E(g_j^k)}{k!},$$

and we need the coefficients $b_m = \kappa_m^0 / m!$ in the power series

$$\sum_{m=1}^{\infty} b_m (-z)^m = \ln \left[ 1 + \sum_{k=1}^{\infty} a_k (-z)^k \right],$$

where $\kappa_m^0$ is the $m$th cumulant of $g_j$. They can be calculated recursively by the algorithm

$$b_k = a_k - \frac{1}{k} \sum_{m=1}^{k-1} m\,b_m\,a_{k-m}, \tag{5-16}$$

which is easily programmed for a calculator or a computer. In particular,

$$b_1 = a_1,$$
$$b_2 = a_2 - \tfrac{1}{2} b_1 a_1 = a_2 - \tfrac{1}{2} a_1^2,$$
$$b_3 = a_3 - \tfrac{1}{3}(b_1 a_2 + 2 b_2 a_1) = a_3 - a_1 a_2 + \tfrac{1}{3} a_1^3,$$

and so on. To derive (5-16), differentiate

$$1 + \sum_{k=1}^{\infty} a_k x^k = \exp \sum_{m=1}^{\infty} b_m x^m$$

with respect to $x$,

$$\sum_{k=1}^{\infty} k\,a_k x^{k-1} = \sum_{m=1}^{\infty} m\,b_m x^{m-1} \left[ 1 + \sum_{n=1}^{\infty} a_n x^n \right],$$

and equate the coefficients of $x^{k-1}$ on both sides of the equation.
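The recursion (5-16) can be sketched in a few lines. The function name `log_series_coeffs` and the two test densities are choices made for this illustration. For a unit-mean exponential density the moments are $E(g^k) = k!$, so that $a_k = 1$ and the recursion should return $b_k = 1/k$; for a unit Gaussian density all coefficients beyond $b_2$ should vanish.

```python
from math import factorial

def log_series_coeffs(a):
    """Coefficients b_k of ln(1 + sum_k a_k x^k) from the recursion (5-16).

    a[k] holds a_k = E(g^k)/k! for k = 1..n; a[0] is ignored.
    Returns a list b of the same length with b[0] = 0 and b[k] = kappa_k / k!.
    """
    n = len(a) - 1
    b = [0.0] * (n + 1)
    for k in range(1, n + 1):
        s = sum(m * b[m] * a[k - m] for m in range(1, k))
        b[k] = a[k] - s / k
    return b

# Unit-mean exponential density: E(g^k) = k!, hence a_k = 1 and b_k = 1/k.
a_exp = [1.0] * 9
b_exp = log_series_coeffs(a_exp)

# Unit Gaussian density: moments 0, 1, 0, 3, 0, 15 for k = 1..6.
moments = [1, 0, 1, 0, 3, 0, 15]
a_gauss = [moments[k] / factorial(k) for k in range(7)]
b_gauss = log_series_coeffs(a_gauss)

print(b_exp[1:])
print(b_gauss[1:])
```

Multiplying the returned coefficients by $k!$ gives the cumulants of the component; the exponential case reproduces $\kappa_k = (k-1)!$.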

The coefficients $b_m$ are then multiplied by $M$ to obtain the coefficients of the
series in (5-5); we denote them by $b'_k = \kappa_k / k!$. In particular,

$$E(G') = b'_1 = M a_1, \qquad \mathrm{Var}\,G' = 2 b'_2 = 2 M b_2, \qquad \kappa_k = M \kappa_k^0.$$

To find the coefficients $c_k$ in (5-11) and subsequent equations, one forms the
exponential function of that series, but with the first two terms set to zero:

$$1 + \sum_{k=3}^{\infty} c_k (-z)^k = \exp \left[ \sum_{m=3}^{\infty} b''_m (-z)^m \right].$$

Here $b''_1 = b''_2 = c_1 = c_2 = 0$ and $b''_m = b'_m = M b_m$, $m \ge 3$. The recurrent relation
in (5-16) again applies, but in the form

$$c_k = b''_k + \frac{1}{k} \sum_{m=1}^{k-1} m\,b''_m\,c_{k-m}. \tag{5-17}$$

Because the summations in (5-16) and (5-17) are the same, they can be programmed
as a subroutine to be employed in both algorithms. Taking these with the recurrent
relation (5-12) for the derivatives of the error function, one can set up a program for
computing the Gram-Charlier series in a computer or in a programmable calculator
with sufficient program and data memory.
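The companion recursion (5-17) can be sketched in the same way. The function name `exp_series_coeffs` and the numerical coefficients are hypothetical values chosen for this illustration. With the first two coefficients set to zero, the recursion should give $c_3$, $c_4$, $c_5$ equal to the corresponding input coefficients and $c_6$ equal to the sixth input coefficient plus half the square of the third.

```python
def exp_series_coeffs(b):
    """Coefficients c_k of exp(sum_m b_m x^m) - 1 from the recursion (5-17).

    b[m] holds the coefficient of x^m for m = 1..n; b[0] is ignored.
    Returns a list c of the same length with c[0] = 0.
    """
    n = len(b) - 1
    c = [0.0] * (n + 1)
    for k in range(1, n + 1):
        s = sum(m * b[m] * c[k - m] for m in range(1, k))
        c[k] = b[k] + s / k
    return c

# Hypothetical coefficients with the first two terms set to zero, as in the text.
b_in = [0.0, 0.0, 0.0, 0.2, -0.1, 0.05, 0.03]
c_out = exp_series_coeffs(b_in)
print(c_out)
```

The inner sum has the same form as the one in (5-16), with the sign of the correction reversed, so a shared helper for the sum would serve both recursions.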

Taking the terms of (5-14) in their natural order is not the most accurate way
of summing that series. When as here the statistic $G'$ is the sum of $M$ independent
and identically distributed random variables $g_j$, the cumulants in (5-5) have a
common factor $M$:

$$\kappa_k = M \kappa_k^0, \qquad \sigma^2 = M \sigma_0^2,$$

where $\sigma_0^2$ is the variance of $g_j$ and $\kappa_k^0$ is its $k$th cumulant. We then find upon working
out the coefficients $c'_k = c_k / \sigma^k$ in (5-14) that $c'_3$ is of order $M^{-1/2}$, $c'_4$ and $c'_6$ are of
order $M^{-1}$, $c'_5$, $c'_7$, and $c'_9$ of order $M^{-3/2}$, and so on. Keeping together terms whose
coefficients are of the same order of magnitude in $M$ amounts to rearranging the
series into the form

$$q^+(G) = \mathrm{erfc}\,y + c'_3\,\phi^{(2)}(y) + [c'_4\,\phi^{(3)}(y) + c'_6\,\phi^{(5)}(y)]
        + [c'_5\,\phi^{(4)}(y) + c'_7\,\phi^{(6)}(y) + c'_9\,\phi^{(8)}(y)] + \cdots, \qquad y = \frac{G - \bar{G}}{\sigma}. \tag{5-18}$$

The subscripts on the coefficients $c'_k$ in the remaining groups follow the pattern
(8, 10, 12), (11, 13, 15), ... , ($3k - 1$, $3k + 1$, $3k + 3$), .... Rearranged in this way
the series (5-18) is known as the Edgeworth series. In evaluating it one should
include all terms of a given order in $M$ if one includes any of them. The series
in (5-11) and (5-15) are similarly rearranged. Further details about the Edgeworth
series can be found in [Fry65, pp. 257-64].

Although the coefficients $c'_k = c_k / \sigma^k$ generally decrease in absolute value from
term to term, the functions $\phi^{(k)}(y)$ after a certain value of $k$ begin to increase
rapidly in magnitude. This behavior renders the computation of the Edgeworth
series somewhat tricky. It is really divergent and behaves like what is called an
"asymptotic" series, converging up to a certain point and then going awry. The
safest way to evaluate (5-18) seems to be to print out the sum after the triplet of
terms in each bracket has been added in. These sums will at first stabilize, but
later begin to diverge. Take as $q^+(G)$ the value at which the sum stabilizes. The
magnitude of the last triplet included indicates the order of magnitude of the error.
The Edgeworth series is the more reliable, the larger $M$ is and the smaller the value
of $|y|$ is, that is, the closer the value of $G$ lies to the expected value $E(G')$ of the
decision statistic.

Let us see how to apply the Edgeworth series to evaluating the false-alarm and
detection probabilities for the statistic $G' = U$ in (4-19) characterizing the quadratic
threshold detector of Sec. 4.2.1. From (4-24) we find for the logarithm of the
moment-generating function

$$\ln h(z) = -M \ln(1 + z) - \frac{Sz}{1 + z} = \sum_{k=1}^{\infty} \left( \frac{M}{k} + S \right) (-z)^k, \qquad S = \tfrac{1}{2} D^2,$$

and by comparison with (5-5) we find under hypothesis $H_1$ that

$$E(G') = \bar{G} = M + S,$$
$$\mathrm{Var}\,G' = \sigma^2 = M + 2S,$$
$$\kappa_k = k! \left( \frac{M}{k} + S \right), \qquad y = \frac{G - M - S}{\sqrt{M + 2S}},$$

where $G = U_0$ is the decision level on the threshold statistic $U$. Here the step
represented by the recursion (5-16) is unnecessary; we already have the cumulants in
simple form.

In Table 5-1 we show the results of a computation with the Edgeworth series
for $M = 50$ and $G = U_0 = 91.0634$. The columns are headed with the subscripts
on the coefficients $c'_k$ in the last group of terms added in. The values calculated by
the algorithms of the Appendix, Sec. C.3, are listed in the column headed "Exact."
Each value in the table is to be multiplied by 10 raised to the power listed in the
second column. Thus the last column of the row marked $D = 0$ reports the
false-alarm probability as $9.99995 \cdot 10^{-7}$. Comparison shows that only for values of
the decision level $G = U_0$ near the expected value of the random variable $G' = U$
can one rely on the accuracy of the Edgeworth series. Because in assessing the
performance of receivers we are usually concerned with false-alarm probabilities $Q_0$
and false-dismissal probabilities $Q_1 = 1 - Q_d$ that are small, we turn in the next
section to a method that is most accurate and most expeditious for values of the
decision level $G$ far in one tail or the other of the distribution of the decision
statistic.
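The whole procedure for the quadratic threshold detector can be sketched compactly. The following is a minimal illustration under stated assumptions, not the program behind Table 5-1: it takes the coefficients $\kappa_k / k! = M/k + S$ from the expansion of $\ln h(z)$, applies (5-17) and (5-12), and returns the partial sums of (5-18) group by group. It assumes that the book's erfc is the Gaussian tail integral $\int_y^\infty \phi(t)\,dt$, which is computed here from the standard-library complementary error function. For comparison under $H_0$ ($S = 0$) it uses the fact that a statistic with moment-generating function $(1 + z)^{-M}$ has a gamma density of order $M$, whose tail is a finite Poisson sum. The function names are hypothetical.

```python
import math

def gauss_tail(y):
    """Integral of the unit Gaussian density from y to infinity (assumed meaning of erfc in the text)."""
    return 0.5 * math.erfc(y / math.sqrt(2.0))

def edgeworth_tail(M, S, G, groups=4):
    """Partial sums of (5-18) for ln h(z) = sum_k (M/k + S)(-z)^k.

    Returns a list: Gaussian term, then the sum after each group of terms.
    """
    kmax = max(6, 3 * groups + 3)          # highest coefficient index used
    sigma = math.sqrt(M + 2.0 * S)
    y = (G - M - S) / sigma
    b = [0.0] * (kmax + 1)                 # kappa_k / k!, first two terms zeroed
    for k in range(3, kmax + 1):
        b[k] = M / k + S
    c = [0.0] * (kmax + 1)                 # recursion (5-17)
    for k in range(1, kmax + 1):
        c[k] = b[k] + sum(m * b[m] * c[k - m] for m in range(1, k)) / k
    phi = [math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)]
    phi.append(y * phi[0])
    for k in range(1, kmax):               # recurrence (5-12)
        phi.append(y * phi[k] - k * phi[k - 1])

    def term(k):
        return c[k] / sigma**k * phi[k - 1]

    sums = [gauss_tail(y)]
    sums.append(sums[-1] + term(3))
    sums.append(sums[-1] + term(4) + term(6))
    for j in range(2, groups + 1):         # triplets (3j-1, 3j+1, 3j+3)
        sums.append(sums[-1] + term(3 * j - 1) + term(3 * j + 1) + term(3 * j + 3))
    return sums

def exact_tail_h0(M, G):
    """Pr(G' > G) under H0: gamma density of order M, i.e. a Poisson sum."""
    t = math.exp(-G)
    total = t
    for k in range(1, M):
        t *= G / k
        total += t
    return total

near_mean = edgeworth_tail(50, 0.0, 55.0)      # decision level close to E(G') = 50
exact_near = exact_tail_h0(50, 55.0)
far_tail = edgeworth_tail(50, 0.0, 91.0634)    # decision level far in the tail
exact_far = exact_tail_h0(50, 91.0634)
print(near_mean, exact_near)
print(far_tail, exact_far)
```

Printing the partial sums side by side, as the text recommends, shows at a glance whether they settle down before they begin to wander; near the mean they agree closely with the Poisson sum, whereas far in the tail they need not.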



5.2 NUMERICAL LAPLACE INVERSION 

In Sec. 5.1 we developed expressions (5-14) and (5-15) for the tail probabilities of 
the decision statistic G' that consist of error-function integrals plus correction terms 
involving the derivatives of the error function. We found these to be unreliable and 
awkward for decision levels G far from the expected value E(G'). 

One might also compute the probability density function of the statistic $G'$
numerically by taking the inverse Fourier transform

$$P(g) = \int_{-\infty}^{\infty} h(-i\omega)\, e^{-i\omega g}\, \frac{d\omega}{2\pi}$$

of its characteristic function by means of the fast Fourier transform algorithm. Tail
probabilities could then be computed by integrating $P(g)$ numerically. When the
decision level $G$ on $G'$ is far from $E(G')$, however, determining tail probabilities
accurately from the result of a numerical Fourier transformation would be difficult,
for one would have to evaluate the characteristic function $h(-i\omega)$ at a large number
of closely spaced sample values of $\omega$ in order to avoid aliasing, which is most
deleterious in the tails of the transform $P(g)$. The computation would be lengthy. We
therefore concentrate here on methods that are most accurate in the far tails of the
distribution and do not require determining the entire probability density function
of the statistic.

5.2.1 Integration through a Saddlepoint 

The probability density function of a random variable such as $G'$ is the inverse
Laplace transform of its moment-generating function as in (5-4). The vertical
contour of integration in that expression can lie anywhere within the regularity domain
of the Laplace transform (5-2): $c_1 < c < c_2$. If we put it to the right of the
imaginary axis, $0 < c < c_2$, we can integrate (5-4) from $-\infty$ to $G$, whereupon we obtain
the left-hand tail probability

$$q_-(G) = \int_{-\infty}^{G} P(g)\, dg = \int_{c - i\infty}^{c + i\infty} \frac{h(z)}{z}\, e^{Gz}\, \frac{dz}{2\pi i}, \qquad 0 < c < c_2, \tag{5-19}$$

which is the cumulative distribution function of $G'$. If, on the other hand, we put
the contour to the left of the imaginary axis, but still within the regularity domain,
$c_1 < c < 0$, we can integrate (5-4) from $G$ to $\infty$, obtaining the right-hand tail
probability, or the complementary cumulative distribution

$$q_+(G) = \int_{G}^{\infty} P(g)\, dg = -\int_{c - i\infty}^{c + i\infty} \frac{h(z)}{z}\, e^{Gz}\, \frac{dz}{2\pi i}, \qquad c_1 < c < 0. \tag{5-20}$$

The tail probabilities can often be calculated by integrating (5-19) or (5-20)
numerically.

The vertical contour of integration can be placed anywhere in the strip we have
called the regularity domain of the moment-generating function $h(z)$. A location
favorable for numerical integration can be found by considering the appearance of
the integrand

$$H(x) = \frac{h(x)}{x}\, e^{Gx} = E\{\exp[x(G - G') - \ln x]\}$$

of (5-19) for real values $x$ of $z = x + iy$ in $0 < x < c_2$. This function $H(x)$ is
a convex $\cup$ function of $x$ in this range. As an illustration, we have plotted in
Fig. 5-1 the function $|H(x)|$ for the moment-generating function for the $M$th-order
$Q$ function given in (4-24).

A function $f(x; G')$ is said to be convex $\cup$ in $x$ if for any $\epsilon$ in (0, 1), with
$\epsilon' = 1 - \epsilon$,

$$f(\epsilon x_1 + \epsilon' x_2; G') \le \epsilon f(x_1; G') + \epsilon' f(x_2; G').$$

If twice differentiable, $f(x; G')$ has a nonnegative second derivative $f''(x; G')$ at all
values of $x$ in the region. Here $G'$ may be a random variable on which that convex
function happens also to depend. If we take the expected value of both sides of that
relation with respect to the distribution of $G'$, the sense of the inequality does not
change:

$$E[f(\epsilon x_1 + \epsilon' x_2; G')] \le \epsilon E[f(x_1; G')] + \epsilon' E[f(x_2; G')].$$

The convexity of $H(x)$ follows from the obvious fact that for any value of a
random variable $G'$, $x(G - G') - \ln x$ is convex $\cup$, and from the easily demonstrated
fact that the exponential function of a convex function is convex $\cup$: if $f(x)$ has a
positive second derivative, so does $\exp f(x)$. Taking the expected value with respect
to the random variable $G'$ preserves this convexity. A similar argument shows that
the integrand of (5-20) is convex $\cup$ in $c_1 < x < 0$. The integrands of (5-19) and
(5-20) thus possess a single minimum when the complex variable $z$ moves along
the real axis within the convergence strip of the Laplace transform in (5-2). These
minima occur at points $z_0$ on the Re $z$-axis, where $z_0$ is a solution of the equation

$$\Phi'(z) = \frac{d}{dz} \ln H(z) = 0, \qquad z = z_0, \quad c_1 < z_0 < c_2. \tag{5-21}$$

Figure 5-1. $|H(x)|$ for the moment-generating function of the $M$th-order Marcum
$Q$ function: $M = 5$, $S = \tfrac{1}{2}D^2 = 6/11$, $G = 40/11$.

It is shown in the theory of complex variables that if an analytic function has
a minimum at a point $z_0$ as $z$ passes through it in one direction (here along the real
axis), it has a maximum there as $z$ passes through it in the perpendicular direction.
If we run our contour of integration in (5-19) through the point $z = z_0$, the integrand
will therefore decrease in absolute value as the point $z = z_0 + iy$ moves up or down
the contour away from the real axis. Because of the shape of the surface $|H(z)|$
over the complex $z$-plane in the neighborhood of $z = z_0$, this point $z_0$ is called a
saddlepoint or col.

In this region the integrand $H(z)$ can be written in the form

$$H(z_0 + iy) \approx \exp[\Phi(z_0) - \tfrac{1}{2}\Phi''(z_0)\, y^2], \tag{5-22}$$

where $\Phi(z) = \ln H(z)$, $\Phi'(z_0) = 0$, and $\Phi''(z_0) > 0$. The integrand $H(z_0 + iy)$
therefore has a Gaussian behavior as a function of $y$ in the neighborhood of the
saddlepoint; we shall take advantage of it in Sec. 5.3.2. Had we placed our contour
at any other point $z_1$ within the strip $0 < \operatorname{Re} z < c_2$, the exponent in (5-22) would
have contained a term $\Phi'(z_1)(iy)$, $z = z_1 + iy$, which would cause the integrand to
oscillate; and when integrating (5-19) numerically, we should have had to space our
samples closely enough to follow the oscillations accurately. By taking the contour
through the saddlepoint, oscillations of the integrand are pushed out to the region
where its absolute value is small.

For these reasons we set $c = z_0^{(-)}$ in (5-19), where $z_0^{(-)}$ is the solution of (5-21)
lying to the right of the origin. Into the integrand of (5-19) we now put $z = z_0 + iy$.
The imaginary part of the integrand is an odd function of $y$ and integrates to zero.
The real part is an even function of $y$, and we can rewrite (5-19) as

$$q_-(G) = \frac{1}{\pi} \int_0^{\infty} \operatorname{Re}\left\{ \frac{h(z_0 + iy)}{z_0 + iy} \exp[(z_0 + iy)G] \right\} dy, \qquad 0 < z_0 = z_0^{(-)} < c_2. \tag{5-23}$$

For the right-hand tail probability a similar argument shows that we should put
the contour through the saddlepoint $z_0^{(+)}$ of the function $(-z)^{-1} h(z) \exp Gz$ lying to
the left of the origin, $c_1 < c = z_0^{(+)} < 0$; $z_0^{(+)}$ is a second root of (5-21). Then (5-20)
becomes

$$q_+(G) = -\frac{1}{\pi} \int_0^{\infty} \operatorname{Re}\left\{ \frac{h(z_0 + iy)}{z_0 + iy} \exp[(z_0 + iy)G] \right\} dy, \qquad c_1 < z_0 = z_0^{(+)} < 0. \tag{5-24}$$

In general it is most efficient to compute $q_+(G)$ for $G > E(G')$ and $q_-(G)$
for $G < E(G')$, although for $G$ near the expected value $E(G')$ it does not much
matter which one chooses. The tail probabilities can be determined as accurately as
one likes by evaluating (5-23) or (5-24) by a numerical quadrature formula utilizing
sufficiently many samples of the integrand spaced closely enough together.

To solve (5-21) for a saddlepoint $z_0$, Newton's method is generally most
expeditious. Starting with a trial value $z_0$, a new trial value $z_0'$ is determined from

$$z_0' = z_0 - \frac{\Phi'(z_0)}{\Phi''(z_0)}, \tag{5-25}$$

where $\Phi''(z)$ is the second derivative of $\Phi(z) = \ln H(z)$. Because the value of this
second derivative is not needed to great precision, it suffices, when convenient, to
approximate it by calculating $\Phi'(z)$ at nearby values $z_0$ and $z_0 + \delta z$ and forming

$$\Phi''(z_0) \approx \frac{\Phi'(z_0 + \delta z) - \Phi'(z_0)}{\delta z}.$$

When Newton's method is so modified, it is called the secant method [Pre86,
pp. 248-51]. Newton's procedure (5-25) is repeated until the value of $z_0$ ceases
changing significantly. Still another way to find the saddlepoint is to use the secant method
to search for the root $x$ of

$$\operatorname{Im} \Phi(x + i\epsilon) = \begin{cases} 0, & 0 < x < c_2, \\ -\pi, & c_1 < x < 0, \end{cases}$$

taking $\epsilon$ to be some suitably small number. High accuracy in determining the
saddlepoint is unnecessary when (5-23) and (5-24) are to be integrated numerically.

A convenient starting value in the search for the saddlepoint can be derived by
approximating the logarithm of the moment-generating function by

$$\ln h(z) \approx -\bar{G} z + \tfrac{1}{2}\sigma_G^2 z^2, \qquad \bar{G} = E(G'), \quad \sigma_G^2 = \operatorname{Var} G'.$$

Then the equation $\Phi'(z) = 0$ for the saddlepoint becomes approximately

$$G - \bar{G} + \sigma_G^2 z - \frac{1}{z} = 0,$$

which can be reduced to a quadratic equation whose roots are

$$z = \frac{\bar{G} - G \pm [(G - \bar{G})^2 + 4\sigma_G^2]^{1/2}}{2\sigma_G^2}.$$

For $G < \bar{G}$ one takes the positive sign, for $G > \bar{G}$ the negative sign. One must
check that the starting value lies inside the regularity domain. If not, a value of $z$
on the same side of the origin, but just inside that domain, can be chosen to begin
the search for the saddlepoint $z_0$ by (5-25).

Although the integrand $\exp \Phi(z)$ is convex, the phase $\Phi(z)$ is not necessarily so
everywhere, and $\Phi''(z)$ may be negative or zero for real values of $z$ in the regularity
domain, causing the search for the saddlepoint by Newton's method or the secant
method to go awry. In the immediate neighborhood of the saddlepoint $z_0$, however,
$\Phi''(z)$ must be positive.
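
The search just described can be sketched as follows for the quadratic-detector phase $\Phi(z) = Gz - M \ln(1+z) - Sz/(1+z) - \ln z$ that results from the moment-generating function of (4-24). This is an illustrative sketch only: the clamp at $z = -0.9$, the step-halving safeguard, and the tolerance are assumptions of ours, chosen to keep the iterates inside the regularity domain $z > -1$; they are not prescribed by the text.

```python
import math

def dphi(z, G, M, S):
    """Phi'(z) for Phi(z) = G z - M ln(1+z) - S z/(1+z) - ln z (real z)."""
    return G - M / (1.0 + z) - S / (1.0 + z) ** 2 - 1.0 / z

def d2phi(z, M, S):
    """Phi''(z); every term is positive for z > -1, z != 0."""
    return M / (1.0 + z) ** 2 + 2.0 * S / (1.0 + z) ** 3 + 1.0 / z ** 2

def starting_value(G, M, S, right_tail):
    """Root of the quadratic obtained from ln h(z) ~ -mean*z + var*z^2/2."""
    mean, var = M + S, M + 2.0 * S
    root = math.sqrt((G - mean) ** 2 + 4.0 * var)
    z = (mean - G + (-root if right_tail else root)) / (2.0 * var)
    # Assumed safeguard: pull the start back inside -1 < z < 0 if needed.
    return max(z, -0.9) if right_tail else z

def saddlepoint(G, M, S, right_tail, tol=1e-12, max_iter=100):
    """Newton iteration (5-25), with steps halved to stay inside the strip."""
    lo, hi = (-1.0, 0.0) if right_tail else (0.0, math.inf)
    z = starting_value(G, M, S, right_tail)
    for _ in range(max_iter):
        step = dphi(z, G, M, S) / d2phi(z, M, S)
        z_new = z - step
        while not (lo < z_new < hi):
            step *= 0.5
            z_new = z - step
        if abs(z_new - z) <= tol * abs(z_new):
            return z_new
        z = z_new
    return z

if __name__ == "__main__":
    print(saddlepoint(32.7103, 10, 0.0, right_tail=True))
    print(saddlepoint(32.7103, 10, 72.0, right_tail=False))
```

For this particular phase $\Phi''(z)$ is positive throughout $z > -1$, so the difficulty mentioned above does not arise; for other moment-generating functions a bracketing method may be the safer choice.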

For integrals of this kind, the trapezoidal rule is both simple and accurate. It
approximates a typical semi-infinite integral by

$$\int_0^{\infty} f(y)\, dy \approx \delta y \left[ \tfrac{1}{2} f(0) + \sum_{k=1}^{k_F} f(k\, \delta y) \right], \qquad f(y) = \operatorname{Re} H(z_0 + iy). \tag{5-26}$$

The number $k_F$ of terms is taken large enough that the final value $f(k_F\, \delta y)$ of the
integrand is negligible. One halves the step size $\delta y$ until the result of the summation
stabilizes to the number of significant figures desired. It is unnecessary, after
dividing the intervals by 2, to recompute the values of the integrand previously computed,
nor even to store them. Before multiplication by $\delta y$, one simply adds to the sum
previously accumulated the values of the integrand at the new, intermediate points
$z = z_0 + iy$. Details of this method of numerical quadrature can be found in
Numerical Recipes [Pre86, Sec. 4.2, pp. 110-4]. Schwartz [Sch69] and Rice [Ric73] have
treated the advantages of this quadrature formula for infinite integrals of analytic
functions. The number of reliable significant figures roughly doubles each time one
divides the step size $\delta y$ by 2.

As (5-22) indicates, the width of the integrand of (5-23) and (5-24) is on the
order of $[\Phi''(z_0)]^{-1/2}$, and it is convenient to specify the initial interval $\delta y$ between
samples of the integrand as

$$\delta y = \eta\, [\Phi''(z_0)]^{-1/2}, \tag{5-27}$$

where $\eta$ is on the order of 1. The summation in (5-26) can be stopped at a point
$y = k_F\, \delta y$ where the ratio of $|H(z_0 + iy)|$ to the absolute value of the sum being
accumulated falls below $\delta y$ times some number $\epsilon$ chosen sufficiently small to ensure
the accuracy desired.
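
The halving scheme and the stopping rule can be sketched generically as below. The function takes any complex-valued $H$ of the real variable $y$; the names `rel_tol`, `max_halvings`, and `max_terms` are our own illustrative parameters, and the cutoff found on the first pass is simply reused on the finer grids, which is one possible reading of the procedure rather than the only one.

```python
import math

def integrate_real_part(H, dy, eps=1e-8, rel_tol=1e-10,
                        max_halvings=10, max_terms=100000):
    """Trapezoidal estimate of the integral of Re H(y) over 0 <= y < infinity."""
    total = 0.5 * H(0.0).real              # half weight at y = 0, as in (5-26)
    n = 0
    for n in range(1, max_terms + 1):
        value = H(n * dy)
        total += value.real
        # Stopping rule: |H| relative to the running sum falls below dy * eps.
        if abs(value) <= eps * dy * abs(total):
            break
    estimate = total * dy
    for _ in range(max_halvings):
        # Only the new midpoints are evaluated; old samples stay in `total`.
        total += sum(H((j + 0.5) * dy).real for j in range(n))
        dy *= 0.5
        n *= 2
        refined = total * dy
        if abs(refined - estimate) <= rel_tol * abs(refined):
            return refined
        estimate = refined
    return estimate

if __name__ == "__main__":
    gaussian = lambda y: complex(math.exp(-0.5 * y * y), 0.0)
    print(integrate_real_part(gaussian, dy=1.0), math.sqrt(math.pi / 2.0))
```

The Gaussian test integrand is chosen because (5-22) shows that the true integrand behaves in much the same way near the saddlepoint.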

As an example, let us consider calculating the $M$th-order $Q$ function in the
form given in (4-25), (4-26), or as in (C-19) of Appendix C:

$$q_+(G) = Q_M(D, \sqrt{2G}) = \mathcal{Q}_M(S, G) = \int_G^{\infty} \left(\frac{U}{S}\right)^{(M-1)/2} e^{-S-U} I_{M-1}(2\sqrt{SU})\, dU, \qquad S = \frac{D^2}{2}.$$

The moment-generating function is

$$h(z) = (1 + z)^{-M} \exp\left(-\frac{Sz}{1 + z}\right)$$

by (4-24). The "phase" $\Phi(z)$ of the integrand is now

$$\Phi(z) = \ln H(z) = Gz - M \ln(1 + z) - \frac{Sz}{1 + z} - \ln z, \tag{5-29}$$

the saddlepoint $z_0$ is a root of the equation

$$\Phi'(z) = G - \frac{M}{1 + z} - \frac{S}{(1 + z)^2} - \frac{1}{z} = 0, \tag{5-30}$$

and the second derivative needed in (5-25) and (5-27) is

$$\Phi''(z) = \frac{M}{(1 + z)^2} + \frac{2S}{(1 + z)^3} + \frac{1}{z^2}. \tag{5-31}$$

Equation (5-30) is most easily solved by Newton's method (5-25), although it can
be reduced to a cubic equation that can be solved by a computer routine for finding
the roots of polynomial equations.

For $M = 10$ and $G = 32.7103$, the second column of Table 5-2 ("Straight
Path") shows the values of $q_+(G)$ for $D = 0, 4, 6$ and of $q_-(G)$ for $D = 7, 10, 12$,
as calculated by contour integration along a straight vertical contour through the
saddlepoint $z_0$ for $\eta = 1, 0.5, 0.25$. The trapezoidal rule was utilized with $\epsilon = 10^{-8}$.
The number of steps taken in the numerical integration is shown in the third column
of the table. For all values of $D$ the results with $\eta = 0.25$ agreed with those calculated
by the recurrent algorithm of the Appendix, Sec. C.3. Bounds on the truncation error
incurred by cutting off the numerical integration at a finite value of $y$ are to be found
in [Hel84d]. The results of that paper justify the simple stopping rule stated under
(5-27).
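
A computation of this kind can be sketched as follows. It combines Newton's method (5-25), the step size (5-27), and a single trapezoidal pass of (5-23) or (5-24) on the straight vertical path. This is a sketch under our own assumptions (a single pass at fixed $\eta$ instead of successive halvings, and a damped Newton search), not a reproduction of the program that generated Table 5-2.

```python
import cmath
import math

def dphi(z, G, M, S):
    """Phi'(z) of (5-30)."""
    return G - M / (1 + z) - S / (1 + z) ** 2 - 1 / z

def d2phi(z, M, S):
    """Phi''(z) of (5-31)."""
    return M / (1 + z) ** 2 + 2 * S / (1 + z) ** 3 + 1 / z ** 2

def saddlepoint(G, M, S, right_tail):
    """Newton's method (5-25) on the real axis, damped to stay in the strip."""
    lo, hi = (-1.0, 0.0) if right_tail else (0.0, math.inf)
    mean, var = M + S, M + 2.0 * S
    root = math.sqrt((G - mean) ** 2 + 4.0 * var)
    z = (mean - G + (-root if right_tail else root)) / (2.0 * var)
    z = max(z, -0.9) if right_tail else z      # assumed safeguard
    for _ in range(200):
        step = dphi(z, G, M, S) / d2phi(z, M, S)
        while not lo < z - step < hi:
            step *= 0.5
        z -= step
        if abs(step) <= 1e-13 * abs(z):
            break
    return z

def integrand(z, G, M, S):
    """H(z) = h(z) exp(Gz) / z with h(z) = (1+z)^(-M) exp(-S z/(1+z))."""
    return cmath.exp(G * z - M * cmath.log(1 + z) - S * z / (1 + z)) / z

def tail_probability(G, M, S, right_tail, eta=0.25, eps=1e-8):
    """q+(G) by (5-24) if right_tail, else q-(G) by (5-23); returns (value, steps)."""
    z0 = saddlepoint(G, M, S, right_tail)
    dy = eta / math.sqrt(d2phi(z0, M, S))          # step size (5-27)
    total = 0.5 * integrand(complex(z0, 0.0), G, M, S).real
    k = 0
    while True:
        k += 1
        value = integrand(complex(z0, k * dy), G, M, S)
        total += value.real
        if abs(value) <= eps * dy * abs(total):    # stopping rule under (5-27)
            break
    result = total * dy / math.pi
    return (-result if right_tail else result), k

if __name__ == "__main__":
    print(tail_probability(32.7103, 10, 0.0, right_tail=True))    # D = 0
    print(tail_probability(32.7103, 10, 72.0, right_tail=False))  # D = 12
```

For $D = 0$ the result can be compared with the elementary sum $e^{-G} \sum_{k=0}^{M-1} G^k / k!$, which is a convenient check on any implementation.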

The probability density function $P(G)$ can be calculated by numerical
saddlepoint integration of the inverse Laplace transform (5-4). The vertical contour is now
taken through a saddlepoint that is the root of $-\psi'(z) = G$, $\psi(z) = \ln h(z)$. Because
of the convexity of the integrand $h(z) \exp Gz$ for real values of $z$, a single such
saddlepoint exists on the Re $z$-axis in the regularity domain $c_1 < \operatorname{Re} z < c_2$. Detailed
examples were worked out by Rice [Ric80].

5.2.2 Integration on a Curved Path

The magnitude of the integrand in integrals of the types in (5-4), (5-19), and (5-20)
decreases most rapidly along a contour $C$ that is known as the path of steepest
descent [Car66, pp. 257-66]. The magnitude of the integrand is $\exp[\operatorname{Re} \Phi(z)]$, and
along the path of steepest descent $\operatorname{Re} \Phi(z)$ drops toward $-\infty$ most precipitously.
With $z = x + iy$, the function

$$f_R(x, y) = \operatorname{Re} \Phi(x + iy)$$
Figure 5-3. Equipotentials (dashed) and flux lines (solid) in the neighborhood of
a neutral point (saddlepoint) $z_0$. These represent a conformal map of the
transformation $z - z_0 = w^{1/2}$, $w = 2[\Phi(z) - \Phi(z_0)]/\Phi''(z_0)$.

along that path will therefore yield the same value of $q_-(G)$ as does integration along
the vertical path. Along the path of steepest descent the integrand remains real and
decreases most rapidly toward zero. Numerical quadrature of the integral along that
path would require fewest steps to attain a given precision. The same considerations
apply to an evaluation of $q_+(G)$ by (5-20) after deforming the vertical path there
into the path of steepest descent, the flux line $\operatorname{Im} \Phi(z) = -\pi$, passing through the
saddlepoint $z_0^{(+)}$.

To integrate along the path of steepest descent, however, will in most problems
require that path to be computed numerically, and to do so would much protract
the numerical integration. Instead we determine a path that, at least as long as the
integrand is of significant magnitude, lies close to the path of steepest descent, but
is simpler to specify. The simplest path seems in most problems to be a parabola
lying symmetrically about the real $z$-axis and passing through the saddlepoint $z_0$.
The only question is what curvature it should have at the saddlepoint $z_0$. We want
the parabola to fit the path of steepest descent as snugly as possible.

A parabola passing with curvature $\kappa$ through the saddlepoint $z_0$ on the Re $z$-axis
is described by the equation

$$z = z_0 + \tfrac{1}{2}\kappa y^2 + iy, \qquad z = x + iy. \tag{5-34}$$

The curvature $\kappa$ of the parabola that fits the path of steepest descent most tightly is
given by

$$\kappa = \frac{\Phi'''(z_0)}{3\Phi''(z_0)}; \tag{5-35}$$

derivatives with respect to $z$ are again indicated by primes. In order to derive this
formula, we expand the phase $\Phi(z)$ in the neighborhood of the saddlepoint as

$$\Phi(z) \approx \Phi(z_0) + \tfrac{1}{2}\Phi''(z_0)(z - z_0)^2 + \tfrac{1}{6}\Phi'''(z_0)(z - z_0)^3 + \cdots.$$

Putting $z = z_0 + x + iy$, we find for the imaginary part of the phase on the path of
steepest descent

$$\operatorname{Im} \Phi(z) \approx \Phi''(z_0)\, xy + \tfrac{1}{6}\Phi'''(z_0)(3x^2 y - y^3).$$

The imaginary part of the phase keeps its saddlepoint value along that path, so that
the right side must vanish. Because $x = \tfrac{1}{2}\kappa y^2$, $x^2$ is on the order of $y^4$, and keeping
only terms through those of third order in $y$, we find

$$\Phi''(z_0)\, x = \tfrac{1}{6}\Phi'''(z_0)\, y^2,$$

whence, by comparison with $x = \tfrac{1}{2}\kappa y^2$, we obtain (5-35). The parabola that fits the
path of steepest descent in this manner is called the osculatory parabola.

On the path of integration the variable of integration $z$ becomes that given by
(5-34), and the element of integration is

$$dz = i(dy - i\, dx) = i(1 - i\kappa y)\, dy.$$

The integral to be evaluated is now, by (5-32),

$$q(G) = \frac{1}{\pi} \int_0^{\infty} \operatorname{Re}[e^{\Phi(z)}(1 - i\kappa y)]\, dy, \qquad z = z_0 + \tfrac{1}{2}\kappa y^2 + iy. \tag{5-36}$$

Again this is most expeditiously integrated by the trapezoidal rule with the step size
chosen as in (5-27).

The phase $\Phi(z)$ contains a term $-\ln z$, which puts a term $-2z_0^{-3}$ into $\Phi'''(z_0)$
in (5-35). When the decision level $G$ is close to the expected value $E(G')$ of the
statistic, and when the saddlepoint is $z_0^{(+)} < 0$, this term may dominate that third
derivative and cause the curvature $\kappa$ to be positive. The osculatory parabola is then
directed into the right half-plane, and the integral (5-36) along it diverges. In a case
such as this, one can use instead the osculatory parabola passing through the right
saddlepoint $z_0^{(-)}$, or one can take the curvature $\kappa$ equal to zero and integrate along
a straight vertical path as in Sec. 5.2.1. Alternatively, one can drop the term $-\ln z$
from the phase and utilize the curvature calculated from (5-35) with $\Phi(z)$ replaced
by the modified phase $\tilde{\Phi}(z) = \Phi(z) + \ln z$. It is the behavior of the phase $\Phi(z)$ far
from the origin that determines the rate of convergence of the integral in (5-36), and
on that behavior the term $-\ln z$ has relatively little influence.

In computing the $M$th-order Marcum $Q$ function, for which the phase $\Phi(z)$
is given by (5-29), the saddlepoints $z_0^{(+)}$ and $z_0^{(-)}$ lie close to the origin for the large
values of the parameters $M$, $S$, and $G$ for which this method is most appropriate.
It suffices to calculate the curvature $\kappa$ by evaluating (5-35) for the modified phase
$\tilde{\Phi}(z)$ and then setting $z = 0$. We obtain the simple formula

$$\kappa = -\frac{S + (M/3)}{S + (M/2)}. \tag{5-37}$$

In the fourth column of Table 5-2 we list the results of computing the $M$th-order
$Q$ function by numerical integration along a parabola whose curvature is specified by
(5-37). The number of steps required in order to attain a certain precision is seen to
be reduced by one-third to one-half of the number required when integrating along
a straight vertical path. The larger the number $M$, the less is gained by utilizing a
parabolic path, for when $M$ is large, the value of the integrand drops off to zero
so rapidly as the point $z$ leaves the saddlepoint $z_0$ that the curvature of the path is
insignificant.
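
Integration along the parabola (5-34) with the curvature (5-37) can be sketched as below. The sign conventions are taken from (5-23) and (5-24), with the extra factor $(1 - i\kappa y)$ from the element of integration; the damped Newton search and the single pass at fixed $\eta$ are our own simplifying assumptions.

```python
import cmath
import math

def dphi(z, G, M, S):
    """Phi'(z) of (5-30)."""
    return G - M / (1 + z) - S / (1 + z) ** 2 - 1 / z

def d2phi(z, M, S):
    """Phi''(z) of (5-31)."""
    return M / (1 + z) ** 2 + 2 * S / (1 + z) ** 3 + 1 / z ** 2

def saddlepoint(G, M, S, right_tail):
    """Damped Newton search on the real axis (assumed safeguards)."""
    lo, hi = (-1.0, 0.0) if right_tail else (0.0, math.inf)
    mean, var = M + S, M + 2.0 * S
    root = math.sqrt((G - mean) ** 2 + 4.0 * var)
    z = (mean - G + (-root if right_tail else root)) / (2.0 * var)
    z = max(z, -0.9) if right_tail else z
    for _ in range(200):
        step = dphi(z, G, M, S) / d2phi(z, M, S)
        while not lo < z - step < hi:
            step *= 0.5
        z -= step
        if abs(step) <= 1e-13 * abs(z):
            break
    return z

def integrand(z, G, M, S):
    """H(z) = h(z) exp(Gz) / z."""
    return cmath.exp(G * z - M * cmath.log(1 + z) - S * z / (1 + z)) / z

def parabola_tail(G, M, S, right_tail, eta=0.25, eps=1e-8):
    """Tail probability by trapezoidal integration along the parabola (5-34)."""
    z0 = saddlepoint(G, M, S, right_tail)
    kappa = -(S + M / 3.0) / (S + M / 2.0)         # curvature (5-37)
    dy = eta / math.sqrt(d2phi(z0, M, S))          # step size (5-27)
    total = 0.5 * integrand(complex(z0, 0.0), G, M, S).real
    k = 0
    while True:
        k += 1
        y = k * dy
        z = complex(z0 + 0.5 * kappa * y * y, y)
        value = integrand(z, G, M, S) * complex(1.0, -kappa * y)
        total += value.real
        if abs(value) <= eps * dy * abs(total):
            break
    result = total * dy / math.pi
    return (-result if right_tail else result), k

if __name__ == "__main__":
    print(parabola_tail(32.7103, 10, 0.0, right_tail=True))
    print(parabola_tail(32.7103, 10, 72.0, right_tail=False))
```

The second element of the returned pair is the number of steps taken, which can be compared with the count for a straight path at the same $\eta$ and $\epsilon$.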

For values of the parameters $M$, $S$, and $G$ small enough so that the recursive
method of the Appendix, Sec. C.3, does not encounter underflow or overflow in its
computation, that method is recommended. Otherwise, the saddlepoint-integration
method described here is preferred. For extremely large values of $M$ (in the
thousands or larger), the first three terms on the right side of (5-29) tend to be of the
same order of magnitude, and significant figures may be lost in the subtractions. It
is best then to write the phase as

$$\Phi(z) = Vz - M[\ln(1 + z) - z] + \frac{Sz^2}{1 + z} - \ln z, \qquad V = G - S - M,$$

into which one puts

$$\ln(1 + z) - z = -\tfrac{1}{2} z^2 F(2, 1; 3; -z),$$

and one computes the hypergeometric function $F(2, 1; 3; -z)$ either by its power series
in $z$ or, as shown in [Hel92d], by its continued fraction. In ordinary practice this
complication will be unnecessary.

Saddlepoint integration is no panacea; there are distributions for which it is
useless. In considering a new application, it is wise to trace a few paths of steepest
descent for typical values of the parameters in the distribution to be computed. Rice
[Ric73] has shown how that can quickly, but approximately, be accomplished. If $z$
and $z + \Delta z$ are two close points on the path of steepest descent,

$$\operatorname{Im}[\Phi(z + \Delta z) - \Phi(z)] = 0$$

or, approximately,

$$\operatorname{Im}[\Phi'(z)\, \Delta z] \approx 0.$$

The complex increment $\Delta z$ must therefore be proportional to $\Phi'^*(z)$. If the segments
on our approximate path are to be of length $\delta$, the point lying next to a point $z$
already computed will be located at

$$z' = z - \delta\, \frac{\Phi'^*(z)}{|\Phi'(z)|}.$$

One starts tracing the path at a point $z_0 + i\epsilon$ lying just above the saddlepoint $z_0$.

The smaller one takes $\delta$, the closer the path so traced will lie to the true path
of steepest descent. It may be necessary occasionally to correct the approximation
by solving the equation $\operatorname{Im} \Phi(z) = 0$, $z = x + iy$, using Newton's method or the
secant method and varying $x$ or $y$ to bring the point $z$ back to the true path. With
Newton's method, for instance, either the $x$ component is corrected by

$$x \leftarrow x - \frac{\operatorname{Im} \Phi(z)}{\operatorname{Im} \Phi'(z)}, \qquad |\operatorname{Re} \Phi'(z)| < |\operatorname{Im} \Phi'(z)|,$$

or the $y$ component is corrected by

$$y \leftarrow y - \frac{\operatorname{Im} \Phi(z)}{\operatorname{Re} \Phi'(z)}, \qquad |\operatorname{Re} \Phi'(z)| > |\operatorname{Im} \Phi'(z)|,$$

until $\operatorname{Im} \Phi(z)$ becomes acceptably small. The contours in Fig. 5-2, with appropriate
modifications for the equipotentials, were constructed by this method.
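
The tracing procedure can be sketched as follows for the phase (5-29), starting from the saddlepoint to the right of the origin, where the path is the flux line $\operatorname{Im} \Phi(z) = 0$. The bisection search for the saddlepoint, the values of $\delta$ and $\epsilon$, and the decision to apply the Newton correction after every step are our own assumptions, made to keep the sketch short and robust.

```python
import cmath

def phi(z, G, M, S):
    """Phase (5-29): G z - M ln(1+z) - S z/(1+z) - ln z."""
    return G * z - M * cmath.log(1 + z) - S * z / (1 + z) - cmath.log(z)

def dphi(z, G, M, S):
    """Phi'(z) of (5-30); accepts real or complex z."""
    return G - M / (1 + z) - S / (1 + z) ** 2 - 1 / z

def real_saddlepoint(G, M, S):
    """Root of Phi'(x) = 0 on x > 0 by bisection (Phi' increases there)."""
    lo, hi = 1e-12, 1.0
    while dphi(hi, G, M, S) <= 0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dphi(mid, G, M, S) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pull_back(z, G, M, S, tol=1e-12, max_iter=20):
    """Newton correction toward Im Phi(z) = 0, adjusting x or y."""
    for _ in range(max_iter):
        im = phi(z, G, M, S).imag
        if abs(im) < tol:
            break
        d = dphi(z, G, M, S)
        if abs(d.real) < abs(d.imag):
            z = complex(z.real - im / d.imag, z.imag)   # correct x
        else:
            z = complex(z.real, z.imag - im / d.real)   # correct y
    return z

def trace_steepest_descent(z0, G, M, S, delta=0.02, steps=50, eps=1e-3):
    """Points on the path of steepest descent leaving z0 into the upper half-plane."""
    z = pull_back(complex(z0, eps), G, M, S)
    path = [z]
    for _ in range(steps):
        d = dphi(z, G, M, S)
        z = pull_back(z - delta * d.conjugate() / abs(d), G, M, S)
        path.append(z)
    return path

if __name__ == "__main__":
    G, M, S = 32.7103, 10, 72.0
    z0 = real_saddlepoint(G, M, S)
    for z in trace_steepest_descent(z0, G, M, S)[::10]:
        print(z, phi(z, G, M, S).real)
```

Plotting the traced points against the parabola (5-34) gives a quick visual check of how snugly the osculatory parabola fits for the parameters of interest.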
In some problems the value of the integrand in (5-19) and (5-20) may decrease
to zero very slowly both along the straight vertical path of Sec. 5.2.1 and along
whatever path one proposes to use to approximate the path of steepest descent.
Changes of the variable of integration that may speed the convergence of the
numerical quadrature have been examined by Rice [Ric73]. Applications of saddlepoint
integration to computing cumulative distributions of continuous random variables
are to be found in [Hel83], [Hel84d], [Hel85a], [Hel86a], [Hel86b], [Hel92a], and
[Hel92c]. The same technique can be applied to (5-4) to compute probability density
functions if these are needed in order, for instance, to set a decision level under the
Bayes criterion.



5.2.3 Integer-valued Random Variables

In the detection of light, as we shall see in Chapter 12, the decision about the
presence or absence of a signal is often based on the number $n$ of photoelectrons
emitted by a photosensitive surface onto which the signal, accompanied perhaps by
random background light, is incident. The number $n$ is a nonnegative integer-valued
random variable described by a sequence $p_0, p_1, p_2, \ldots$ of probabilities

$$p_k = \Pr(n = k), \qquad k = 0, 1, 2, \ldots.$$

In certain nonparametric detectors, to be treated in Chapter 8, the decision is also
based on an integer-valued random variable. In order to determine false-alarm and
detection probabilities, one must be able to calculate the cumulative distribution

$$q_k^{(-)} = \Pr(n < k) = \sum_{r=0}^{k-1} p_r \tag{5-38}$$

of the random variable $n$, or its complement

$$q_k^{(+)} = \Pr(n \ge k) = \sum_{r=k}^{\infty} p_r = 1 - q_k^{(-)}. \tag{5-39}$$
An analysis of the detector often yields the probability-generating function $h(z)$
of the random variable $n$, which is defined by

$$h(z) = E(z^n) = \sum_{k=0}^{\infty} p_k z^k \tag{5-40}$$

[Hel91, pp. 124-5]. It resembles the $z$ transform of the sequence $\{p_k\}$, except that
$z^k$ appears instead of $z^{-k}$. The probabilities $p_k$ can be recovered from $h(z)$ by
differentiation,

$$p_k = \frac{1}{k!} \frac{d^k}{dz^k} h(z) \Big|_{z=0},$$

although this is seldom useful for very large values of $k$ unless it can be achieved
analytically. Because

$$\sum_{k=0}^{\infty} p_k = 1,$$

$h(1) = 1$. The function $h(z)$ has no singularities within or on the unit circle, for

$$|h(z)| \le \sum_{k=0}^{\infty} p_k |z|^k \le \sum_{k=0}^{\infty} p_k = 1, \qquad |z| \le 1.$$

Derivatives of the probability-generating function $h(z)$ at $z = 1$ provide the
factorial moments

$$c_m = E[n(n - 1)(n - 2) \cdots (n - m + 1)] = \sum_{k=0}^{\infty} k(k - 1) \cdots (k - m + 1)\, p_k = \frac{d^m}{dz^m} h(z) \Big|_{z=1}.$$

In particular,

$$E(n) = c_1 = h'(1) \tag{5-41}$$

and

$$c_2 = h''(1) = E(n^2 - n),$$

with primes denoting differentiation with respect to $z$. The variance of the random
variable $n$ is then

$$\operatorname{Var} n = E(n^2) - [E(n)]^2 = h''(1) + h'(1) - [h'(1)]^2. \tag{5-42}$$

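Let us express the probabilities $p_r$ by Cauchy's theorem as

$$p_r = \oint_C z^{-(r+1)} h(z)\, \frac{dz}{2\pi i},$$

where $C$ is a closed curve enclosing the origin of the complex $z$-plane, but none of
the singularities of $h(z)$. Substituting this into (5-38), we find

$$q_k^{(-)} = \oint_C \sum_{r=0}^{k-1} z^{-(r+1)} h(z)\, \frac{dz}{2\pi i} = \oint_C \frac{1 - z^{-k}}{z - 1}\, h(z)\, \frac{dz}{2\pi i}.$$

If we take the curve $C$ as a closed curve $C_-$ surrounding the origin, but enclosing
neither the point $z = 1$ nor any singularities of $h(z)$, then

$$\oint_{C_-} \frac{h(z)}{z - 1}\, \frac{dz}{2\pi i} = 0 \tag{5-43}$$

by Cauchy's theorem, and

$$q_k^{(-)} = \oint_{C_-} \frac{z^{-k} h(z)}{1 - z}\, \frac{dz}{2\pi i}. \tag{5-44}$$

Taking $C$, on the other hand, as a closed curve $C_+$ including both $z = 0$ and $z = 1$,
but no singularities of $h(z)$, we find that the integral corresponding to (5-43) yields
1, and the complementary cumulative distribution of the number $n$ becomes

$$q_k^{(+)} = 1 - q_k^{(-)} = \oint_{C_+} \frac{z^{-k} h(z)}{z - 1}\, \frac{dz}{2\pi i}. \tag{5-45}$$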
The integrands of (5-44) and (5-45) are convex $\cup$ functions of $x = \operatorname{Re} z$. The
integrand of (5-45), for instance, is

$$\frac{x^{-k} h(x)}{x - 1} = E\left[\frac{x^{n-k}}{x - 1}\right] = E[x^{n-k-1}(1 + x^{-1} + x^{-2} + \cdots)], \qquad x = \operatorname{Re} z > 1,$$

and the bracket is the sum of powers of $x$, all of which have nonnegative second
derivatives and are therefore convex $\cup$. A similar argument holds for the integrand
of (5-44). The integrands of (5-44) and (5-45) therefore possess unique saddlepoints
in the regions $0 < \operatorname{Re} z < 1$ and $1 < \operatorname{Re} z < \xi_1$, respectively, where $\xi_1$ is the leftmost
singularity of $h(z)$, if any, on the positive Re $z$-axis. Saddlepoint integration should
be considered as a way of evaluating the tail probabilities $q_k^{(-)}$ and $q_k^{(+)}$.
The "phase" of the integrand is now

$$\Phi(z) = \ln h(z) - k \ln z - \ln[\pm(z - 1)], \tag{5-46}$$

and its derivative is

$$\Phi'(z) = \frac{h'(z)}{h(z)} - \frac{k}{z} - \frac{1}{z - 1} = 0$$

at the saddlepoint. If this equation cannot be solved analytically, Newton's method,
described in Sec. 5.2.1, is most expeditious. Alternatively one can use the secant
method to search for the root of

$$\operatorname{Im} \Phi(x + i\epsilon) = 0$$

for a suitably small value of $\epsilon$.

Before attempting to integrate (5-44) or (5-45) numerically, it is wise to plot
paths of steepest descent for a few typical values of the parameters in the problem
by the technique described at the end of Sec. 5.2.2. For the photoelectron-counting
distributions to be studied in Chapter 12, it is found that these paths go off to
infinity, and the closed contours in (5-44) and (5-45) can be deformed into osculatory
parabolic contours through the saddlepoint without crossing any singularities of the
integrand. Their curvatures are calculated by (5-35). The method of Sec. 5.2.2 can
then be applied.

For values of $k$ near the expected value $E(n)$, the saddlepoint lies close to the
point $z = 1$, and the term $\ln[\pm(z - 1)]$ in (5-46) may cause the osculatory parabola
to go off in the wrong direction when its curvature is calculated from (5-35), that
is, in a direction that leads to divergence of the integral in (5-44) or (5-45). If that
happens, one can drop that term in calculating the curvature (as we did in Sec. 5.2.2
in using the modified phase $\tilde{\Phi}(z)$), or one can switch to the saddlepoint lying on
the other side of the point $z = 1$, or one can integrate along a straight vertical path
through the saddlepoint by setting the curvature $\kappa$ equal to 0.

As a simple example, we consider the Poisson distribution with expected value
$\mu$. Its probability-generating function is

$$h(z) = e^{\mu(z - 1)},$$

and the "phase" is

$$\Phi(z) = \mu(z - 1) - k \ln z - \ln(z - 1).$$

The saddlepoint $z_0$ is the root of

$$\Phi'(z) = \mu - \frac{k}{z} - \frac{1}{z - 1} = 0,$$

which leads to a quadratic equation that the reader should write out and solve
explicitly.

Table 5-3
Poisson Distribution ($\mu = 20$, $\epsilon = 10^{-8}$)

  k    Numerical integration    Number of steps    Saddlepoint approximation
  5    0.16944753(-4)            8                 0.17145(-4)
       0.16944743(-4)           16
       0.16944743(-4)           30
 10    [illegible]               8                 0.49684(-2)
       0.49954122(-2)           16
       0.49954122(-2)           30
 15    [illegible]              10                 0.10099
       0.10486428               20
       0.10486428               38
 20    0.52975001                7                 0.50880
       0.52974272               14
       0.52974272               27
 30    0.21818225(-1)            7                 0.21764(-1)
       0.21818217(-1)           14
       0.21818217(-1)           27
 40    0.53202024(-4)            7                 0.53268(-4)
       0.53202024(-4)           14
       0.53202024(-4)           27
 50    0.12458926(-7)            7                 0.12477(-7)
       0.12458926(-7)           14
       0.12458926(-7)           27

In Table 5-3 we list the results of the numerical saddlepoint integration of (5-44)
and (5-45) along a parabolic contour whose curvature was determined from (5-35).
Even for an expected value $\mu$ as low as 20, good agreement with the exact cumulative
probabilities is obtained with a few steps of numerical integration. Summing Poisson
probabilities is a simple computation, of course, and one would turn to saddlepoint
integration only when $\mu$ is so large that $\exp(-\mu)$ underflows or when $n$ is so large
that the number of terms in summing (5-38) is excessive.
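
A computation of this kind can be sketched as follows for the Poisson case. The saddlepoints are the two roots of the quadratic $\mu z^2 - (\mu + k + 1)z + k = 0$ obtained from $\Phi'(z) = 0$; the contour is the parabola (5-34) with the curvature (5-35). The fallback to the curvature of the modified phase when (5-35) gives a positive value, and the single pass at fixed $\eta$, are our own assumptions.

```python
import cmath
import math

def poisson_tail(k, mu, upper, eta=0.25, eps=1e-8):
    """Pr(n >= k) if upper, else Pr(n < k), for a Poisson variable (k >= 1).

    Returns (probability, number of steps)."""
    b = mu + k + 1.0
    disc = math.sqrt(b * b - 4.0 * mu * k)
    # Root above z = 1 for (5-45), root in (0, 1) for (5-44).
    z0 = (b + disc) / (2.0 * mu) if upper else (b - disc) / (2.0 * mu)
    d2 = k / z0 ** 2 + 1.0 / (z0 - 1.0) ** 2            # Phi''(z0)
    d3 = -2.0 * k / z0 ** 3 - 2.0 / (z0 - 1.0) ** 3     # Phi'''(z0)
    kappa = d3 / (3.0 * d2)                             # curvature (5-35)
    if kappa > 0.0:
        kappa = -2.0 / (3.0 * z0)    # assumed fallback: drop the ln(z - 1) term
    sign = 1.0 if upper else -1.0

    def weighted(y):
        z = complex(z0 + 0.5 * kappa * y * y, y)
        f = cmath.exp(mu * (z - 1.0) - k * cmath.log(z)) / (sign * (z - 1.0))
        return f * complex(1.0, -kappa * y)             # dz = i(1 - i kappa y) dy

    dy = eta / math.sqrt(d2)
    total = 0.5 * weighted(0.0).real
    steps = 0
    while True:
        steps += 1
        value = weighted(steps * dy)
        total += value.real
        if abs(value) <= eps * dy * abs(total):
            break
    return total * dy / math.pi, steps

if __name__ == "__main__":
    for k, upper in [(10, False), (30, True)]:
        print(k, poisson_tail(k, 20.0, upper))
```

For moderate $\mu$ the results can be compared with a direct summation of the Poisson probabilities, as in the table.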
If the integer-valued random variable $n$ takes on values only in a finite range
$0 \le n \le M$, the probability-generating function will be a polynomial of degree $M$,
and the integrands of (5-44) and (5-45) will have $M$ zeros in the complex plane. The
path of steepest descent from the saddlepoint will run into one of those zeros, and it
is not possible to find a path of integration going out to infinity. The integrand may
have numerous other saddlepoints in the complex plane, both above and below the
Re $z$-axis. When $E(n)$ and $k$ are large, however, it often happens that a path can
be taken through the real-axis saddlepoint and that along that path the integrand
drops to zero so rapidly that by using the stopping rule described in Sec. 5.2.2
one can evaluate the probability $q_k^{(-)}$ or $q_k^{(+)}$ with high accuracy. It is necessary to
study each such application carefully, and no general criteria for the applicability of
saddlepoint integration can be given. The method has been applied to distributions
of nonnegative integer-valued random variables in papers such as [Hel84a], [Hel84b],
[Hel84c], [Hel85b], [Hel87], and [Hel88a].

5.3 APPROXIMATIONS 

The sorts of computational methods we introduced in Secs. 5.1 and 5.2 may
entail rather much programming, particularly when the moment-generating or the
probability-generating function is complicated, and for a quick assessment of
receiver performance one often resorts to simply calculated approximations to
false-alarm and detection probabilities. We shall concern ourselves here with those that
are most accurate when the decision statistic $G'$ is the sum of a large number $M$ of
terms as in (5-1).

What is usually the simplest approximation consists of taking only the first
term in the Gram-Charlier series (5-14),

$$q_+(G) = \Pr(G' > G) = \int_G^{\infty} P(g)\, dg \approx \operatorname{erfc}\left(\frac{G - \bar{G}}{\sigma}\right), \tag{5-49}$$

where $\operatorname{erfc}(\cdot)$ is the error-function integral defined in (1-11), $\bar{G} = E(G')$, and
$\sigma^2 = \operatorname{Var} G'$. This is known as the Gaussian approximation. The false-alarm probabilities
$Q_0$ and the false-dismissal probabilities $1 - Q_d$ we are mostly concerned with,
however, are quite small, and the decision level $G$ is usually rather far in the right- or
left-hand tail of the probability density function of the statistic $G'$, $P_0(\cdot)$ or $P_1(\cdot)$ as
the case may be. As we have seen in Sec. 5.1, it is necessary then to take a large
number of terms in the Gram-Charlier or Edgeworth series in order to attain acceptable
accuracy, and the first term (5-49) is likely to be a very poor approximation.
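
How poor the Gaussian approximation can be in the tail is easily illustrated. The sketch below takes $M = 10$, $S = 0$, and $G = 32.7103$, the values used with Table 5-2, for which the exact tail probability is the elementary sum $e^{-G} \sum_{k=0}^{M-1} G^k / k!$. It assumes that $\operatorname{erfc}$ of (1-11) is the upper tail of the unit normal distribution, as its use in the Gaussian example of Sec. 5.3.1 indicates; the function names are ours.

```python
import math

def gaussian_tail(x):
    """Upper tail of the unit normal: (2 pi)^(-1/2) * integral_x^inf exp(-t^2/2) dt."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def gaussian_approximation(G, mean, var):
    """First term (5-49) of the Gram-Charlier series."""
    return gaussian_tail((G - mean) / math.sqrt(var))

def exact_central_tail(G, M):
    """Pr(G' > G) for S = 0, where G' has a gamma distribution of integer order M."""
    return math.exp(-G) * sum(G ** k / math.factorial(k) for k in range(M))

if __name__ == "__main__":
    G, M = 32.7103, 10
    print("exact   :", exact_central_tail(G, M))
    print("Gaussian:", gaussian_approximation(G, mean=M, var=M))
```

The two numbers differ by several orders of magnitude, which is why more accurate approximations are needed for decision levels far in the tail.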

5.3.1 The Chernoff Bound 

At times, and this is particularly true in theoretical studies of communication-system
performance, it suffices merely to be sure that the number one has obtained lies above
the true value of the tail probability, $Q_0$ or $Q_1 = 1 - Q_d$. Such assurance is provided
by the Chernoff bound. We shall derive it first for the right-hand tail probability
$q_+(G)$ defined as in (5-13).
Denote the probability density function of the decision statistic $G'$ by $P(g)$,
and let $G$ be the decision level. Then the complementary cumulative distribution
function of $G'$ is

$$q_+(G) = \int_G^{\infty} P(g)\, dg = \int_{-\infty}^{\infty} U(g - G) P(g)\, dg,$$

where $U(\cdot)$ is the unit step function,

$$U(x) = \begin{cases} 1, & x > 0, \\ 0, & x < 0. \end{cases}$$

For any negative real value of $x$,

$$U(g - G) \le e^{-(g - G)x}, \qquad x < 0,$$

as we can see by plotting both sides as functions of $g$. Hence

$$q_+(G) \le \int_{-\infty}^{\infty} e^{-(g - G)x} P(g)\, dg = E[e^{-(G' - G)x}] = e^{Gx} h(x), \qquad x < 0, \tag{5-50}$$

where as in (5-2) $h(\cdot)$ is the moment-generating function of the statistic $G'$. We
should like to make the bound in (5-50) as tight as possible by choosing a value of
$x$ for which the right side is minimum.

The function $e^{Gx} h(x)$ on the right side of (5-50) is convex $\cup$ for values of
$x = \operatorname{Re} z$ in the regularity domain $c_1 < x < c_2$ of the Laplace transform in (5-2):
The function $\exp[-x(G' - G)]$ is a convex function of $x$ for all values of the random
variable $G'$, and as we learned in Sec. 5.2.1, the expected value of a convex function
is convex. The expected value in the right side of (5-50) is therefore convex $\cup$. It
possesses at most one minimum within the regularity domain. When $G$ is sufficiently
large, $G > E(G')$, this minimum will occur at a negative value of $x$; and this will be
the case, for instance, for the small values of the false-alarm probability $Q_0$ that we
are ordinarily concerned with bounding. [If the minimum occurs at a positive value
of $x$, we must put $x = 0$ in (5-50), and we obtain the trivial bound $q_+(G) \le 1$.]

Minimizing the right side of (5-50) is equivalent to minimizing In h{x) + xG. 
With derivatives denoted by primes, the value of x is given by 

8f + G = ' * s0 - (5 " 51) 

Call the solution of this equation xq. Then 

q + (G) < k(xo) exp Gx . (5-52) 

This result is called the Chernoff bound [Che62]. 

Suppose for example that the random variable $G'$ has a Gaussian distribution
with expected value zero and variance 1. Then because the moment-generating
function of $G'$ is

$$h(z) = e^{z^2/2},$$

we find from (5-51) that $x_0 = -G$, whereupon we obtain the bound

$$q_+(G) = \operatorname{erfc} G \le e^{-G^2/2}, \qquad G > 0.$$

The left-hand tail probability is

$$q_-(G) = \int_{-\infty}^{G} P(g)\,dg = \int_{-\infty}^{\infty} P(g)U(G - g)\,dg,$$

and by a similar argument we derive the bound

$$q_-(G) \le h(x_1)\exp Gx_1, \qquad x_1 > 0,$$

where $x_1$ is the root of (5-51) lying to the right of the origin. The root $x_1$ will in
general be positive if $G$ is sufficiently smaller than the expected value $E(G')$, as will
be the case for false-dismissal probabilities $Q_1 = 1 - Q_d$ of an order of magnitude
of interest.

Although the Chernoff bound has proved useful in information theory, it is
usually somewhat larger than the exact value of the tail probability $q_+(G)$ or $q_-(G)$
and does not provide an accurate approximation to it. In the next part we shall
derive an approximation for the tail probability that is closely related to that bound.
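
The looseness of the bound is easy to see numerically for the Gaussian example. The following Python sketch compares $e^{-G^2/2}$ with the exact tail probability; it assumes that erfc in this text denotes the tail integral of the unit-variance Gaussian density, which is what the inequality $\operatorname{erfc} G \le e^{-G^2/2}$ implies. The function names are ours and purely illustrative.

```python
import math

def gaussian_tail(G):
    """Exact right-hand tail probability of a zero-mean, unit-variance Gaussian."""
    return 0.5 * math.erfc(G / math.sqrt(2.0))

def chernoff_bound_gaussian(G):
    """Chernoff bound h(x0) exp(G x0) with h(x) = exp(x^2/2) and x0 = -G."""
    if G <= 0.0:
        return 1.0  # the minimum would lie at x >= 0, so only the trivial bound holds
    x0 = -G
    return math.exp(0.5 * x0 * x0 + G * x0)

if __name__ == "__main__":
    for G in (1.0, 2.0, 3.0, 4.0, 5.0):
        exact = gaussian_tail(G)
        bound = chernoff_bound_gaussian(G)
        print(f"G={G:.0f}  exact={exact:.3e}  bound={bound:.3e}  ratio={bound / exact:.1f}")
```

The ratio of bound to exact value grows with $G$, which illustrates why the bound is safe but not an accurate approximation.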



5.3.2 The Saddlepoint Approximation 

In most detection problems we are concerned with false-alarm probabilities $Q_0$ on
the order of $10^{-4}$ or less and with detection probabilities $Q_d$ on the order of 0.99
or more. The decision level $G$ then lies far in either the right or the left tail of
the density function of the statistic $G'$, and approximate values of these probabilities
with accuracy adequate for most engineering purposes can be determined with much
less computation than is involved in the numerical integration described in Sec. 5.2.
When the contour of integration passes through the saddlepoint $z_0$, the integrands
of (5-19) and (5-20) have nearly a Gaussian form, and by approximating them as
such, the integrals are quickly evaluated.

We write (5-22), which is the integrand of (5-19), as

$$\exp\Phi(z_0 + iy) = \exp\left[\Phi(z_0) - \tfrac{1}{2}\Phi''(z_0)y^2\right]\exp r(y),$$

where

$$r(y) = \sum_{k=3}^{\infty}\frac{\Phi^{(k)}(z_0)}{k!}(iy)^k,$$

and we expand the second exponential function as

$$\exp r(y) = 1 + r(y) + \tfrac{1}{2}[r(y)]^2 + \cdots.$$

Putting this into (5-19), collecting terms with identical powers of $y$, and integrating
over $-\infty < y < \infty$, we find that the terms with odd powers of $y$ vanish. When the
statistic $G'$ has the form (5-1) with its components $g_j$ independent and identically
distributed random variables, the function

$$\Phi(z) = \ln h(z) + Gz - \ln z = M\ln\eta(z) + Gz - \ln z$$

contains a term proportional to the number $M$ of inputs, and $\Phi(z)$ and its derivatives
at $z = z_0$ can be said to be of order $M$.

After carrying out the integration of (5-19) term by term and keeping the terms
in the result of order $M^{-1}$, we obtain for the left-hand tail probability

$$q_-(G) \approx [2\pi\Phi''(z_0)]^{-1/2}\exp\Phi(z_0)\,[1 + T_1 + \cdots], \qquad 0 < z_0 < c_2, \tag{5-53}$$

with

$$T_1 = \frac{\Phi^{(4)}(z_0)}{8[\Phi''(z_0)]^2} - \frac{5[\Phi'''(z_0)]^2}{24[\Phi''(z_0)]^3}. \tag{5-54}$$

As in Sec. 5.2, $z_0$ is the solution of

$$\frac{d}{dz}\ln h(z) - \frac{1}{z} + G = 0 \tag{5-55}$$

lying to the right of the origin. Treating (5-20) in the same way, we find for the
right-hand tail probability the approximation

$$q_+(G) \approx -[2\pi\Phi''(z_0)]^{-1/2}\exp\Phi(z_0)\,[1 + T_1 + \cdots], \qquad c_1 < z_0 < 0, \tag{5-56}$$

with $T_1$ given again by (5-54), but now the saddlepoint $z_0$ is the solution of
(5-55) lying to the left of the origin. Because $z_0 < 0$, the factor $\exp(-\ln z_0) = 1/z_0$
contained in $\exp\Phi(z_0)$ is negative, and the right side of (5-56) is positive.
The terms omitted are on the order of $M^{-2}$.
Expressions for these terms are to be found in [Hel78]; they involve derivatives of the
phase $\Phi(z)$ of ever higher order. If the moment-generating or probability-generating
function is complicated, calculating these derivatives can be troublesome, and one
may have to resort to numerical differentiation, which is the less accurate, the higher
the order of the derivative. Before attempting to calculate such terms, one should consider
the method of numerical integration presented in Sec. 5.2. The farther the level
$G$ lies in the tail of the density function of the statistic $G'$ and the larger the number
$M$, the more accurate are the approximations represented by (5-53) and (5-56).

An early application of this technique to finding asymptotic forms of the Hankel
functions was made by Peter Debye [Deb09]. He attributed the method to an
unpublished paper of Riemann's [Rie53]. Daniels [Dan54] described in some detail
the use of a saddlepoint method for approximating probability density functions
and probability mass functions, but did not extend the method to cumulative
distributions [Hel78]. The method can be applied to tail probabilities of integer-valued
random variables as well. One uses the phase $\Phi(z)$ defined in (5-46) in terms of the
probability-generating function $h(z)$ (5-40) of the integer-valued random variable in
question.

Table 5-4 lists values of the saddlepoint approximation to Marcum's $Q$ function
for $M = 50$. Those in the column headed "First order" were calculated by (5-53) and
(5-56); for the column headed "Zero order," the correction term $T_1$ was omitted. The
phase is given in (5-29), and the saddlepoint $z_0$ is determined by solving (5-30).
For $D = 8$ and $D = 10$ the decision level $G$ is close to the expected value $E(G') =
M + \tfrac{1}{2}D^2$. In that neighborhood the saddlepoint approximation loses accuracy,
and the Edgeworth series is more reliable. Values of the saddlepoint approximation
in zero order for the $Q$ function with $M = 10$ are listed in the last column of Table
5-2. Again we find that the farther $G$ lies in the tail of the distribution, the more
accurate this approximation becomes.



Table 5-4 Saddlepoint Approximation
(Q Function: M = 50, G = 91.0634)

Tail       D    Saddlepoint   Zero order     First order    Exact
$q_+(G)$   0    -0.46364      9.99908 (-7)   1.00019 (-6)   9.99995 (-7)
           2    -0.42730      4.96812 (-6)   4.97184 (-6)   4.97052 (-6)
           4    -0.33934      2.03945 (-4)   2.04549 (-4)   2.04410 (-4)
           6    -0.23064      1.05470 (-2)   1.06804 (-2)   1.06496 (-2)
           8    -0.12556      1.86097 (-1)   1.97196 (-1)   1.94614 (-1)
$q_-(G)$   10    0.13020      2.31365 (-1)   2.39964 (-1)   2.37915 (-1)
           12    0.24202      8.73300 (-3)   8.77383 (-3)   8.76320 (-3)
           14    0.37318      2.06676 (-5)   2.06632 (-5)   2.06603 (-5)
           16    0.51148      2.18325 (-9)   2.18121 (-9)   2.18116 (-9)

5.3.3 Calculating Approximate Decision Levels for
the Neyman-Pearson Criterion

The saddlepoint approximation is particularly useful when seeking the decision level
$G$ to attain a prescribed false-alarm probability $Q_0$ or an energy-to-noise ratio $S$
that yields a prescribed false-dismissal probability $Q_1 = 1 - Q_d$. Let us denote the
cumulative distribution and its complement by $q_-(x; S)$ and $q_+(x; S)$, respectively.
Since usually $Q_0 \ll 1$ and $Q_1 \ll 1$, these probabilities, $q_+(G; 0)$ and $q_-(G; S)$,
respectively, have the decision level $G$ in the far right or the far left tail of the
applicable probability density function, where the saddlepoint approximation is most
accurate.

We first seek the value of $G$ for which $q_+(G; 0)$ equals the preassigned false-alarm
probability $Q_0$. Putting $\ln h(z) = \psi(z)$ into (5-56), we write the saddlepoint
approximation as

$$q_+(G; 0) \approx |z|^{-1}[2\pi\Phi''(z)]^{-1/2}\exp[\psi(z) + Gz], \tag{5-57}$$

with the saddlepoint $z$ determined by

$$\Phi'(z) = \psi'(z) + G - z^{-1} = 0.$$

Solving this for $G$, we replace $G$ in (5-57) by

$$G = z^{-1} - \psi'(z), \tag{5-58}$$

and write (5-57) as

$$\ln Q_0 = \psi(z) + 1 - z\psi'(z) - \ln|z| - \tfrac{1}{2}\ln[2\pi\Phi''(z)], \tag{5-59}$$

$$\Phi''(z) = \psi''(z) + z^{-2}.$$

Table 5-5 Marcum's Q Function ($Q_0 = 10^{-6}$, $Q_d = 0.999$)

            Saddlepoint                 Exact
   M        G            S              G            S
   10       32.7180      51.3098        32.7103      51.2928
   20       48.8296      61.7248        48.8265      61.7187
   50       91.0632      82.3819        91.0634      82.3866
   100      154.9172     105.5961       154.9190     105.6082
   200      274.5543     138.3292       274.5576     138.3492
   500      613.5707     203.0855       613.5762     203.1183
   1000     1157.5704    275.9176       1157.5779    275.9635



One solves (5-59) for $z$ by the secant method or any other convenient method.
Many calculators and mathematical software programs have routines that
solve equations of this kind. One can take the initial value of $z$ just inside the range
$c_1 < \mathrm{Re}\,z < 0$. One then determines the approximate value of the decision level $G$
from (5-58).
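
As a hedged illustration of this procedure, the following Python sketch solves (5-59) by bisection and then evaluates (5-58). It assumes that under hypothesis $H_0$ the moment-generating function is $h(z) = (1 + z)^{-M}$, so that $\psi(z) = -M\ln(1 + z)$ and $c_1 = -1$; this is the zero-signal case of the form we take to underlie Marcum's $Q$ function, and the function name is ours. Bisection replaces the secant method only to keep the sketch short and robust.

```python
import math

def decision_level(M, Q0, iters=200):
    """Return (z, G): z solves (5-59) on -1 < z < 0, and G follows from (5-58)."""
    target = math.log(Q0)

    def residual(z):
        psi = -M * math.log1p(z)                     # psi(z) = ln h(z), assumed form
        dpsi = -M / (1.0 + z)                        # psi'(z)
        d2phi = M / (1.0 + z) ** 2 + 1.0 / z ** 2    # Phi''(z) = psi''(z) + z^-2
        return (psi + 1.0 - z * dpsi - math.log(abs(z))
                - 0.5 * math.log(2.0 * math.pi * d2phi) - target)

    lo, hi = -1.0 + 1e-9, -1e-9   # residual(lo) < 0 < residual(hi) for Q0 < 1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    z = 0.5 * (lo + hi)
    G = 1.0 / z + M / (1.0 + z)   # (5-58): G = 1/z - psi'(z)
    return z, G

if __name__ == "__main__":
    for M in (10, 20, 50, 100):
        z, G = decision_level(M, 1e-6)
        print(f"M={M:4d}  z={z:+.5f}  G={G:.4f}")
```

For $M = 50$ and $Q_0 = 10^{-6}$ the result can be compared with the saddlepoint entry for $G$ in Table 5-5.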

A similar method can sometimes be used to determine the total energy-to-noise
ratio $S$ required to attain a prescribed probability $Q_d$ of detection. When the statistic
$G'$ is governed under hypothesis $H_1$ by Marcum's $Q$ function, for instance, we can
solve (5-30) for $S$ as a function of $z$, obtaining

$$S(z) = (G - z^{-1})(1 + z)^2 - M(1 + z). \tag{5-60}$$

The saddlepoint approximation (5-53) for the false-dismissal probability yields

$$\ln(1 - Q_d) \approx \Phi(z) - \tfrac{1}{2}\ln[2\pi\Phi''(z)],$$

with $\Phi(z)$ and $\Phi''(z)$ given by (5-29) and (5-31), respectively. Into these we substitute
$S(z)$ from (5-60), obtaining the equation

$$\ln(1 - Q_d) \approx Gz - M\ln(1 + z) - \frac{zS(z)}{1 + z} - \ln z - \tfrac{1}{2}\ln(2\pi) - \tfrac{1}{2}\ln\left[M(1 + z)^{-2} + 2S(z)(1 + z)^{-3} + z^{-2}\right].$$

This equation, with (5-60), is solved for the saddlepoint $z > 0$, either by the secant
method or by a computer root-finding algorithm. The result is substituted into (5-60)
to obtain an approximation to the energy-to-noise ratio $S$. Table 5-5 lists values of
the decision level $G$ and the energy-to-noise ratio $S$ as calculated by this method,
along with their exact values. The saddlepoint approximation yields accurate results
even for small values of $M$.
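
The calculation of $S$ can be sketched in the same way. The Python fragment below is a minimal illustration, assuming the phase $\Phi(z) = -M\ln(1 + z) - Sz/(1 + z) + Gz - \ln z$ that is consistent with (5-60); the function name and the bracketing interval for $z$ are our own choices and suit decision levels like those in Table 5-5.

```python
import math

def required_energy_ratio(M, G, Qd, iters=200):
    """Return (z, S) giving detection probability Qd at decision level G."""
    target = math.log(1.0 - Qd)

    def S_of_z(z):   # (5-60)
        return (G - 1.0 / z) * (1.0 + z) ** 2 - M * (1.0 + z)

    def residual(z):
        S = S_of_z(z)
        d2phi = M / (1.0 + z) ** 2 + 2.0 * S / (1.0 + z) ** 3 + 1.0 / z ** 2
        return (G * z - M * math.log1p(z) - z * S / (1.0 + z) - math.log(z)
                - 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(d2phi) - target)

    lo, hi = 1e-6, 50.0   # assumed bracket: residual(lo) > 0 > residual(hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    z = 0.5 * (lo + hi)
    return z, S_of_z(z)

if __name__ == "__main__":
    z, S = required_energy_ratio(50, 91.0632, 0.999)
    print(f"z={z:.5f}  S={S:.4f}")
```

With $M = 50$, $G = 91.0632$, and $Q_d = 0.999$ the result can be compared with the saddlepoint entry for $S$ in Table 5-5.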

If greater accuracy is desired, the approximations for $G$ and $S$ as derived here
can be used as starting values in the solution of

$$\ln q_+(G; 0) = \ln Q_0,$$

$$\ln q_-(G; S) = \ln(1 - Q_d),$$

by the secant method, in which the quantities on the left side are computed by
saddlepoint integration as in Sec. 5.2 or by the recursive algorithms in the appendix.
This method was used to compute the curves in Fig. 4-9 and the exact values in
Table 5-5.

5.3.4 Calculating Decision Levels for the
Bayes Criterion

When the receiver of a binary communication system bases its decisions on the
statistic $G'$, it must form the likelihood ratio $\Lambda(G) = P_1(G)/P_0(G)$ for comparison
with the decision level $\Lambda_0$ of (1-17). As shown by Daniels [Dan54], the density
functions of $G'$, if unknown in closed form, can be calculated by a similar saddlepoint
approximation. Applying it to the contour integral in (5-4), we find the zero-order
approximation

$$P(G) \approx [2\pi\psi''(z_0)]^{-1/2}\exp\psi(z_0) = [2\pi\psi''(z_0)]^{-1/2}h(z_0)\exp Gz_0,$$

$$\psi(z) = \ln h(z) + Gz,$$

where the saddlepoint $z_0$ is the root of

$$\psi'(z_0) = \frac{h'(z_0)}{h(z_0)} + G = 0. \tag{5-61}$$

When $G$ lies in the right tail, the root of (5-61) must be negative, $z_0 < 0$; for $G$ in
the left tail, it is $z_1 > 0$. The equation to be solved here is the same as that involved
in the Chernoff bound; see (5-51).

A decision level $G$ on the statistic $G'$ is normally in the right tail of $P_0(\cdot)$ and
the left tail of $P_1(\cdot)$. Thus the likelihood ratio for $G$ is approximately

$$\Lambda(G) \approx \left[\frac{\psi_0''(z_0)}{\psi_1''(z_1)}\right]^{1/2}\frac{h_1(z_1)}{h_0(z_0)}\exp[G(z_1 - z_0)], \tag{5-62}$$

where $z_0 < 0$ is the saddlepoint for $P_0(G)$ and $z_1 > 0$ is the saddlepoint for $P_1(G)$;
the subscripts now refer to the hypotheses $H_0$ and $H_1$, so that $h_k(\cdot)$ and
$\psi_k(z) = \ln h_k(z) + Gz$ pertain to hypothesis $H_k$. From (5-62) we find an
equation for the decision level:

$$G = \frac{1}{z_1 - z_0}\left\{\ln\Lambda_0 + \ln\frac{h_0(z_0)}{h_1(z_1)} + \frac{1}{2}\ln\frac{\psi_1''(z_1)}{\psi_0''(z_0)}\right\}. \tag{5-63}$$

Starting with a trial value of the decision level $G$, conveniently taken halfway between
the expected values $E(G' \mid H_0)$ and $E(G' \mid H_1)$ of $G'$ under the two hypotheses, one
calculates $z_0 < 0$ and $z_1 > 0$ from (5-61), substitutes them into the right side of
(5-63) to obtain a new value of $G$, and repeats the process until the decision level
converges. Because the error probability $P_e = \zeta_0 Q_0 + \zeta_1(1 - Q_d)$ is insensitive to the
exact value of the decision level $G$, the approximate value of $G$ resulting from
solution of (5-63) will be adequate for most purposes.
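
The iteration just described can be sketched in Python as follows. This is an illustration under our own assumptions, not a prescription: we equate the ratio of two zero-order density approximations to the decision level $\Lambda_0$ and solve for the $G$ that multiplies $z_1 - z_0$ in the exponent, and we take as a test case the hypothetical moment-generating functions $h_0(z) = (1 + z)^{-M}$ and $h_1(z) = [1 + (1 + s)z]^{-M}$, for which the exact decision level is $G = [(1 + s)/s][\ln\Lambda_0 + M\ln(1 + s)]$. All names are illustrative.

```python
import math

def bisect(f, lo, hi, iters=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bayes_decision_level(M, s, Lambda0, max_iters=50):
    """Fixed-point iteration for the decision level G (assumed test-case h0, h1)."""
    scale = (1.0, 1.0 + s)   # h_k(z) = (1 + scale[k] z)^-M under hypothesis H_k

    def ln_h(k, z):
        return -M * math.log1p(scale[k] * z)

    def d_ln_h(k, z):        # h_k'(z) / h_k(z)
        return -M * scale[k] / (1.0 + scale[k] * z)

    def d2_psi(k, z):        # psi_k''(z) = second derivative of ln h_k(z)
        return M * scale[k] ** 2 / (1.0 + scale[k] * z) ** 2

    G = 0.5 * (M + M * (1.0 + s))   # halfway between E(G'|H0) and E(G'|H1)
    for _ in range(max_iters):
        z0 = bisect(lambda z: d_ln_h(0, z) + G, -1.0 + 1e-12, 0.0)  # left of origin
        z1 = bisect(lambda z: d_ln_h(1, z) + G, 0.0, 1e6)           # right of origin
        G_new = (math.log(Lambda0) + ln_h(0, z0) - ln_h(1, z1)
                 + 0.5 * math.log(d2_psi(1, z1) / d2_psi(0, z0))) / (z1 - z0)
        converged = abs(G_new - G) <= 1e-12 * max(1.0, abs(G))
        G = G_new
        if converged:
            break
    return G

if __name__ == "__main__":
    print(bayes_decision_level(10, 1.0, 1.0), 20.0 * math.log(2.0))
```

For this test case the factors peculiar to the saddlepoint approximation cancel in the ratio, so the iteration settles on the exact level almost at once; for other statistics several passes are generally needed.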

5.3.5 The Uniform Asymptotic Expansion

A saddlepoint approximation for the tail probabilities $q_+(G)$ and $q_-(G)$ that is
uniformly accurate over the entire range of values of $G$, even in the neighborhood of
the expected value $E(G')$, has been described by Rice [Ric68] and Lugannani and
Rice [Lug80]. It takes as the saddlepoint the root $z_0$ of (5-61), treating
the integrands of (5-19) and (5-20) in a different manner from ours. The terms
of lowest order in this uniform asymptotic approximation are, for $G > E(G')$ and
$z_0 < 0$,

$$q_+(G) \approx \operatorname{erfc}\{[-2\psi(z_0)]^{1/2}\} + A_0 - B_0, \tag{5-64}$$

$$\psi(z) = \ln h(z) + Gz,$$

$$A_0 = |z_0|^{-1}[2\pi\psi''(z_0)]^{-1/2}\exp\psi(z_0),$$

$$B_0 = \tfrac{1}{2}[-\pi\psi(z_0)]^{-1/2}\exp\psi(z_0).$$

When $G \gg E(G')$, the first term in (5-64) nearly cancels the term $B_0$, leaving the
term $A_0$, which is close to the zero-order saddlepoint approximation in (5-56). When
the decision level $G$ is near to $E(G')$, on the other hand, the terms $A_0$ and $B_0$ are
nearly equal, and the first term is roughly the same as the Gaussian approximation
in (5-49).

For $G < E(G')$ and $z_0 > 0$, the left-hand tail probability $q_-(G)$ is
approximated by formulas of the same form. The terms of higher order are rather more
complicated and can be found in the paper of Lugannani and Rice [Lug80]. Like
the terms of higher order in the saddlepoint approximation (5-53) and (5-56), they
involve derivatives of the moment-generating function. This paper also examines the
question of the existence of the saddlepoint calculated from (5-61) for various types
of density function $P(\cdot)$.
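
To see how (5-64) behaves near the expected value, one can try it on a statistic whose tail probability is known in closed form. The Python sketch below is an illustration under our own assumptions: the moment-generating function is taken as $h(z) = (1 + z)^{-M}$ with integer $M$, for which $q_+(G) = e^{-G}\sum_{k=0}^{M-1} G^k/k!$ exactly and the root of (5-61) is $z_0 = M/G - 1$; erfc is taken to be the Gaussian tail integral, as implied by the Chernoff example of Sec. 5.3.1. The function names are ours.

```python
import math

def gaussian_tail(x):
    """The text's erfc x: tail integral of the unit-variance Gaussian density."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def uniform_tail(M, G):
    """Lowest-order uniform approximation (5-64) to q+(G), valid for G > M."""
    z0 = M / G - 1.0                       # root of h'(z)/h(z) + G = 0, negative here
    psi = -M * math.log1p(z0) + G * z0     # psi(z0) = ln h(z0) + G z0, negative
    d2psi = M / (1.0 + z0) ** 2            # psi''(z0)
    A0 = math.exp(psi) / (abs(z0) * math.sqrt(2.0 * math.pi * d2psi))
    B0 = 0.5 * math.exp(psi) / math.sqrt(-math.pi * psi)
    return gaussian_tail(math.sqrt(-2.0 * psi)) + A0 - B0

def exact_tail(M, G):
    """Exact q+(G) for integer M: exp(-G) * sum_{k<M} G^k / k!."""
    term, total = 1.0, 1.0
    for k in range(1, M):
        term *= G / k
        total += term
    return math.exp(-G) * total

if __name__ == "__main__":
    for G in (10.5, 12.0, 15.0, 20.0):
        print(f"G={G:5.1f}  uniform={uniform_tail(10, G):.5e}  exact={exact_tail(10, G):.5e}")
```

The case $G = 10.5$ lies close to the expected value $E(G') = 10$, where $A_0$ and $B_0$ are individually large and nearly equal.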

Problems 

5-1. Use the contour integral for the $M$th-order $Q$ function $Q_M(S, y)$ defined in Appendix C
to show that

$$\frac{\partial Q_M(S, y)}{\partial S} = P_{M+1}(S, y),$$

where $P_{M+1}(S, y) = -\partial Q_{M+1}(S, y)/\partial y$ is the associated probability density function.

5-2. Show by the Chernoff method of Sec. 5.3.1 that

$$\sum_{r=k}^{\infty}\frac{y^r}{r!}e^{-y} \le \left(\frac{ey}{k}\right)^k e^{-y}, \qquad k > y.$$

By Stirling's formula

$$k! \approx (2\pi k)^{1/2}k^k e^{-k}\left[1 + \frac{1}{12k} + \cdots\right],$$

and we then find

$$\sum_{r=k}^{\infty}\frac{y^r}{r!}e^{-y} \lesssim (2\pi k)^{1/2}\frac{y^k}{k!}e^{-y}.$$

5-3. Determine, to zero order only, the saddlepoint approximation to $\operatorname{erfc} G$, $G > 0$.
Evaluate it for $G = 1$ to $G = 10$ in unit steps, and compare the results with the exact
values. Calculate $\operatorname{erfc} G$ for the same values of $G$ by numerical contour integration of
(5-24), taking a straight vertical path through the saddlepoint. Why cannot a parabolic
contour be used in this computation?

5-4. For each of the Swerling cases of amplitude distributions of fading signals described
in Sec. 4.2.4, determine the saddlepoint approximation for the average probability
of detection, working out only the zero-order approximation as in (5-53) and (5-56).
Assume that the noise is white and Gaussian and that the receiver sums the outputs of
a quadratic rectifier following a filter matched to the signal, as in (4-19).

5-5. Calculate the zero-order saddlepoint approximation for the values of the Poisson
probability distribution listed in Table 5-3.

5-6. Calculate zero-order saddlepoint approximations to the cumulative binomial
distribution and its complement by applying the technique of Sec. 5.3.2 to the integrals in
(5-44) and (5-45).

5-7. Use the saddlepoint approximation to calculate the energy-to-noise ratio plotted in
Fig. 4-6 for Rayleigh-fading signals as a function of the number $M$ of inputs. Using
the method of Sec. 5.3.3, first determine the decision level required for a preassigned
false-alarm probability by the saddlepoint approximation to (4-30), and then use the
same type of approximation to (4-31) to calculate the value of $S$ for a preassigned
average detection probability $Q_d$. Plot your approximation to the energy-to-noise ratio
$S$ versus $M$ for $50 \le M \le 1000$, $Q_0 = 10^{-6}$, and $Q_d = 0.9999$, and compare with the
curves in Fig. 4-6.

5-8. Work out a saddlepoint approximation to the probability $Q_d(m)$ of detection in (4-87),
taking $1 - Q_d \ll 1$, $M \gg 1$.

6



Estimation of 
Signal Parameters 



6.1 THE THEORY OF ESTIMATION 

A radar has more to do than detect targets; it must find where they are and how 
they are moving. For this purpose it must estimate the values of certain parameters 
of the received echo signals, and because of the noise the estimates will be in error. 
The theory of estimation shows us how to design the receiver to minimize errors due 
to noise, and it tells us how large the irreducible residual errors will on the average 
be. 

Locating a target requires specifying its distance and its direction. The distance 
is proportional to the interval between transmission of a radar pulse and reception 
of its echo, and to measure it the radar must determine the instant when the echo 
arrives. It might do so by timing the peak of the received signal, but exactly when
this peak occurs is made uncertain by the noise. The azimuth of the target can be 
estimated by comparing the amplitudes of successive echoes as the radar antenna 
rotates. By changing those amplitudes in a random, unpredictable way the noise 
introduces error into that estimate. A measurement of the Doppler-shifted carrier 
frequency of the echo yields the component of the target velocity in the direction of 
the radar; this too will be falsified by the noise. 

A radar echo can be represented in the form

$$s(t; A, \psi, \tau, \Omega) = A\,\mathrm{Re}\,F(t - \tau)\exp(i\Omega t + i\psi),$$

where $F(t)$ is the complex envelope of the pulse, $A$ its amplitude, $\tau$ its time of arrival
or epoch, $\Omega$ its carrier frequency, and $\psi$ the phase of its carrier. The input to the
receiver is

$$v(t) = s(t; A, \psi, \tau, \Omega) + n(t), \qquad 0 < t < T,$$

with $n(t)$ the random noise; and from the input $v(t)$ observed during the interval
$(0, T)$ the receiver is to determine the values of the unknown parameters $A$, $\tau$, and
$\Omega$. The joint probability density function of the set $v = \{v(t_i)\}$ of samples of the input
at times $t_i$ depends on those parameters through the signal,

$$p(v \mid A, \psi, \tau, \Omega) = p_0(\{v_i - s(t_i; A, \psi, \tau, \Omega)\}),$$

where $p_0(\{n_i\})$ is the joint probability density function of samples $n_i = n(t_i)$ of the
random noise at times $t_i$. In this way the unknowns become parameters of the
distribution of the observations $v$.

The phase $\psi$ carries no useful information about the target when, as often,
the phase of the transmitted pulse is uncontrolled. Over such an uninformative
parameter the joint probability density function may be averaged with respect to an
accepted prior distribution $z(\cdot)$,

$$p(v \mid A, \tau, \Omega) = \int_0^{2\pi} z(\psi)\,p(v \mid A, \psi, \tau, \Omega)\,d\psi,$$

to provide a joint density function of the observations that depends only on the
quantities of interest. We seek values of $A$, $\tau$, and $\Omega$ for which the joint probability
density function $p(v \mid A, \tau, \Omega)$ in some sense best describes the observations $v$.
Alternatively, we may estimate the phase and other such "nuisance parameters" and
discard the results.

Signal parameters may also need to be estimated in communications. Suppose
that we wish to transmit numerical data that can take on arbitrary values within
a limited range, as in telemetry when the temperature or pressure at a point is to
be conveyed periodically to a distant observer. The amplitudes of a succession of
pulses might be set by the data, or their carrier frequencies might be caused to deviate
proportionally from a reference frequency stored at transmitter and receiver. The
receiver must then estimate the amplitudes or the frequencies as the pulses arrive
mixed with random noise, and it should be able to do so as accurately as possible.

The theory of estimation presupposes that one knows the joint probability
density function $p(v \mid \theta)$ of the outcomes $v$ of a set of measurements as a function
of $m$ unknown parameters $\theta = (\theta_1, \theta_2, \ldots, \theta_m)$. These are called the estimanda.
They can be represented as a point in an $m$-dimensional parameter space $\Theta$. If, for
instance, $\theta$ are the parameters of a signal $s(t; \theta)$ received in the midst of random
noise, $p(v \mid \theta)$ will derive from the joint probability density function of samples of
the noise.

One seeks a strategy that on the basis of measured values of the $n$ random
variables $v_1, v_2, \ldots, v_n$ assigns some value $\hat\theta_k(v)$ as an estimate of the $k$th parameter
$\theta_k$; this function $\hat\theta_k(v)$ of the data $v$ is called an estimator. The number that results
when a particular set $v$ of data is substituted into this function is called an estimate
of the parameter $\theta_k$. The collection of $m$ such estimators is designated by the vector

$$\hat\theta(v) = (\hat\theta_1(v), \hat\theta_2(v), \ldots, \hat\theta_m(v)).$$

Like the vector $\theta$ of true values of the estimanda, it is represented by a point in the
parameter space $\Theta$.

Because the data are random variables, no two experiments will yield the same
values of the estimates $\hat\theta$, even though the true set of parameters $\theta$ is the same for
both. The most one can hope for is that the estimates $\hat\theta_k$ will be close to the true
values $\theta_k$ "on the average." Given a set of strategies or estimators $\hat\theta(v)$ and the
conditional probability density function $p(v \mid \theta)$, one can calculate a conditional probability
density function $q(\hat\theta \mid \theta)$ of the estimates $\hat\theta$. One would like this conditional density
function to be sharply peaked in the neighborhood of the true values $\theta$.

If the expected value $\bar\theta_k = E[\hat\theta_k(v) \mid \theta]$ of the estimator $\hat\theta_k(v)$ equals the true
value $\theta_k$, the estimator is said to be unbiased. The difference $\bar\theta_k - \theta_k$ is called the
bias of the estimator. The mean-square error incurred by the estimator is defined as

$$\mathcal{E} = E\{[\hat\theta_k(v) - \theta_k]^2\}$$

and it can be written as

$$\mathcal{E} = E\{[\hat\theta_k(v) - \bar\theta_k + \bar\theta_k - \theta_k]^2\} = E\{[\hat\theta_k(v) - \bar\theta_k]^2\} + (\bar\theta_k - \theta_k)^2 = \operatorname{Var}\hat\theta_k(v) + (\bar\theta_k - \theta_k)^2,$$

the cross term vanishing because $E[\hat\theta_k(v) - \bar\theta_k \mid \theta] = 0$.
The mean-square error therefore equals the sum of the variance $\operatorname{Var}\hat\theta_k(v)$ of the
estimator and the square of its bias. Both the bias and the variance of an estimator
should be small, and it is often necessary to compromise between these desiderata.



6.1.1 Maximum-a-posteriori-probability Estimators



Imagine the space $\Theta$ of the estimanda divided into a large number $M$ of small
regions $\Delta_j$. Denote the center of the $j$th region $\Delta_j$ by the $m$-vector $\theta_j$, and denote
by $H_j$ the proposition "The parameter set $\theta$ lies in region $\Delta_j$." Consider a strategy
whereby the receiver chooses among the $M$ hypotheses $H_j$ on the basis of the
observed data $v = (v_1, v_2, \ldots, v_n)$. When the receiver selects hypothesis $H_i$, it issues the
estimate $\hat\theta(v) = \theta_i$. The simplest strategy, as we saw in Sec. 1.1, directs the receiver
to choose that hypothesis $H_i$ whose conditional probability $\Pr(H_i \mid v)$ is largest; this
is Bayes's rule. Here

$$\Pr(H_i \mid v) = \int_{\Delta_i} p(\theta \mid v)\,d^m\theta, \tag{6-1}$$

where $p(\theta \mid v)$ is the conditional probability density function of the parameter $\theta$,
given the data $v$. By Bayes's theorem for continuous random variables this is

$$p(\theta \mid v) = \frac{z(\theta)\,p(v \mid \theta)}{p(v)}, \tag{6-2}$$

with

$$p(v) = \int_{\Theta} z(\theta)\,p(v \mid \theta)\,d^m\theta \tag{6-3}$$

the overall probability density function of the data $v$ [Hel91, p. 157], [Pap91, p. 164].

If now the regions $\Delta_j$ become smaller and smaller, the number $M$ of hypotheses
increasing, the probability $\Pr(H_j \mid v)$ becomes proportional to the conditional
probability density function $p(\theta_j \mid v)$ at the center $\theta_j$ of region $\Delta_j$, and from this point
of view the best strategy for the receiver is to issue as its estimate that set $\theta$ for
which the conditional density function $p(\theta \mid v)$ is maximum. That is, the optimum
estimator $\hat\theta(v)$ is defined by the criterion

$$p(\hat\theta(v) \mid v) \ge p(\theta \mid v), \qquad \forall\,\theta \in \Theta.$$

This is called the maximum-a-posteriori-probability (MAP) estimator.

The joint probability density function $q(\hat\theta \mid \theta)$ of the set $\hat\theta(v) = (\hat\theta_1(v), \ldots, \hat\theta_m(v))$
of estimators of the $m$ unknown parameters, when their true values are $\theta$, can be
expressed as

$$q(\theta' \mid \theta) = \int_{R_n} \delta(\theta' - \hat\theta(v))\,p(v \mid \theta)\,d^n v, \tag{6-4}$$

where $\delta(\cdot)$ is an $m$-dimensional delta function. We attach a prime (') to the set $\theta'$ of
algebraic variables figuring in the density function in order to distinguish them from
the set of estimators $\hat\theta(v)$ of the $m$ parameters $\theta$. Just as Bayes's rule maximizes
the overall probability of correct decision in hypothesis testing, the MAP estimator
maximizes the average value

$$Q = \int_{\Theta} z(\theta)\,q(\theta \mid \theta)\,d^m\theta.$$

The posterior density function $q(\theta' \mid \theta)$ of the estimate is to be heaped up as high as
possible at the true value $\theta$, at least on the average. By (6-2) and (6-4) we can write
this average as

$$Q = \int_{R_n} p(v)\,p(\hat\theta(v) \mid v)\,d^n v,$$

and because $p(v) \ge 0$, this will be largest if to each point $v$ of the data space $R_n$ we
assign as $\hat\theta(v)$ that set $\theta$ for which $p(\theta \mid v)$ is maximum.

When the estimanda $\theta$ are parameters of a signal $s(t; \theta)$, we can introduce
the joint probability density function $p_0(v)$ of the data in the absence of any signal,
divide it into both numerator and denominator of (6-2),

$$p(\theta \mid v) = \frac{z(\theta)\,[p(v \mid \theta)/p_0(v)]}{p(v)/p_0(v)},$$

and pass to the limit of an infinite number of samples $v$ of the input $v(t)$ of the
receiver, whereupon the conditional probability density function of the estimanda
becomes the functional

$$p(\theta \mid v(t)) = \frac{z(\theta)\,\Lambda[v(t) \mid \theta]}{\Lambda[v(t)]}, \tag{6-5}$$

$$\Lambda[v(t)] = \int_{\Theta} z(\theta)\,\Lambda[v(t) \mid \theta]\,d^m\theta,$$

where $\Lambda[v(t) \mid \theta]$ is the likelihood functional for detecting the signal $s(t; \theta)$ in noise.
Because the denominator $\Lambda[v(t)]$ does not depend on the parameters $\theta$, it suffices
to find those values of the parameters for which the numerator $z(\theta)\Lambda[v(t) \mid \theta]$ is
maximum.

6.1.2 Maximum-likelihood Estimator

The MAP estimator, by (6-2), is given by the vector $\theta$ for which the product
$z(\theta)p(v \mid \theta)$ is maximum. In many problems it is difficult or impossible to specify
a precise prior probability density function $z(\theta)$ for the estimanda $\theta$, but usually
whatever prior density function $z(\theta)$ is reasonable will be rather broader as a function
of $\theta$ than the density function $p(v \mid \theta)$ of the data. The strictly MAP estimate
then lies close to the vector $\theta$ for which the probability density function $p(v \mid \theta)$ of
the data is maximum. Unless this is the case, one's measurement of the data $v$ is
providing little information to improve one's knowledge of $\theta$ as embodied in the
prior density function $z(\theta)$. It is customary, therefore, to assume that the prior
probability density function $z(\theta)$ is so broad that its effect on the maximization of
the product $z(\theta)p(v \mid \theta)$ is negligible. The values of $\theta$ for which $p(v \mid \theta)$ is maximum
are termed the maximum-likelihood estimates of $\theta$:

$$p(v \mid \hat\theta(v)) \ge p(v \mid \theta), \qquad \forall\,\theta \in \Theta.$$

This inequality defines the maximum-likelihood estimator. One can say that the
maximum-likelihood estimates are those values of the parameters for which $p(v \mid \theta)$
best fits the observed data $v$. It is this estimator that we shall principally utilize in
estimates of the arrival time and carrier frequency of a signal.

In most estimation problems it is desirable that the same value of a parameter
$\theta$ be obtained whether one estimates $\theta$ itself or some monotone function $f(\theta)$ of
that parameter. For instance, if one estimates $\theta^3$ and takes the cube root of the
result, one would like to find the same value that a direct estimate of $\theta$ would
yield. Maximum-likelihood estimates possess this property. Maximum-a-posteriori-probability
estimates, however, based on some prior density function $z(\theta)$, in general
do not, because of the different weightings assigned to corresponding ranges of the
parameter. For a discussion of such matters as applied to physical measurements,
the article by Annis, Cheston, and Primakoff [Ann53] and the books by Jeffreys
[Jef73], [Jef83] and Janossy [Jan65] may be consulted.

When the estimate of parameters $\theta$ of a signal $s(t) = s(t; \theta)$ is to be based on
the input $v(t)$ to a receiver, we must pass to the limit of an infinitely dense sampling.
This we can again achieve by dividing $p(v \mid \theta)$ by an appropriate probability density
function $p_0(v)$ of the data that is independent of the parameters $\theta$, usually that of
samples of pure noise:

$$\lim_{n\to\infty}\frac{p(v \mid \theta)}{p_0(v)} = \Lambda[v(t) \mid \theta].$$

Then the maximum-likelihood estimator $\hat\theta[v(t)]$ is that set of parameter values for
which the likelihood functional $\Lambda[v(t) \mid \theta]$ for detecting the signal $s(t; \theta)$ is as large
as possible. There may be many values of certain of the parameters for which
this functional $\Lambda[v(t) \mid \theta]$ possesses local maxima, and the highest of those maxima
identifies the maximum-likelihood estimate. This concept will be developed in the
following section.

6.1.3 Estimating the Mean of a Gaussian Distribution

As a simple example to illustrate these ideas, let us estimate the mean or expected
value $m$ of a Gaussian distribution with known variance $\delta^2$ by $n$ independent
observations $v_1, v_2, \ldots, v_n$ of a random variable $v$. Their joint probability density function
is

$$p(v \mid m) = (2\pi\delta^2)^{-n/2}\exp\left[-\sum_{k=1}^{n}\frac{(v_k - m)^2}{2\delta^2}\right]. \tag{6-6}$$

Let us assume that previous experiments have shown the true mean $m$ to be normally
distributed with expected value $\mu$ and variance $\beta^2$,

$$z(m) = (2\pi\beta^2)^{-1/2}\exp\left[-\frac{(m - \mu)^2}{2\beta^2}\right]. \tag{6-7}$$

From the definition (6-2) we can with some labor calculate the posterior probability
density function of the true mean $m$, given the set of outcomes $v$:

$$p(m \mid v) = (2\pi S^2)^{-1/2}\exp\left\{-\frac{1}{2S^2}\left[m - S^2\left(\frac{nX}{\delta^2} + \frac{\mu}{\beta^2}\right)\right]^2\right\}, \qquad \frac{1}{S^2} = \frac{n}{\delta^2} + \frac{1}{\beta^2}, \tag{6-8}$$

where $X$ is the sample mean of the data,

$$X = \frac{1}{n}\sum_{k=1}^{n} v_k. \tag{6-9}$$

The value of $m$ at which the posterior density function $p(m \mid v)$ is maximum is the
MAP estimator

$$\hat m(v) = \frac{nX/\delta^2 + \mu/\beta^2}{n/\delta^2 + 1/\beta^2} = \frac{\beta^2 X + \mu\delta^2/n}{\beta^2 + (\delta^2/n)}. \tag{6-10}$$

As always when the distributions of the observations and of the parameters are
Gaussian functions of both the observations and the parameters, this estimator is
a linear function of the data $v$. Because this estimator is a function of the data $v$
only through the sample mean $X$, $\hat m = \hat m(X)$, $X$ is said to be a sufficient statistic
for estimating the mean $m$.

Upon examining the estimate given in (6-10), we see that if the initial uncertainty
$\beta$ in the value of the mean $m$ is very large, $\beta^2 \gg \delta^2/n$, the MAP estimate
$\hat m$ is nearly equal to the sample mean, $\hat m \approx X$. Indeed, the sample mean $X$ is the
maximum-likelihood estimator of $m$, as we can show by differentiating (6-6) with
respect to $m$ and setting the result equal to 0. If, on the other hand, the error
variance of the measurements is very large, $\delta^2/n \gg \beta^2$, the estimate $\hat m$ is close to
the prior expected value $\mu$, and the observations contribute little new information
about the value of the mean $m$.

The estimator $\hat m$ in (6-10) is biased, for its expected value is

$$E(\hat m \mid m) = \langle\hat m\rangle = \frac{\beta^2 m + \mu\delta^2/n}{\beta^2 + (\delta^2/n)} \tag{6-11}$$

when the true value of the mean is $m$. As the number $n$ of measurements increases,
this expected value $\langle\hat m\rangle$ approaches the true mean $m$, and we can say that the
estimator is asymptotically unbiased.

Because the variance of the sample mean $X$ equals $\delta^2/n$, the variance of the
MAP estimator is

$$\operatorname{Var}\hat m = \left[\frac{\beta^2}{\beta^2 + (\delta^2/n)}\right]^2\frac{\delta^2}{n},$$

and the reader should verify that the mean-square error $E\{[\hat m(v) - m]^2\}$ equals the
sum of this variance and the square of the bias.
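
A short numerical sketch may make these relations concrete. The Python code below evaluates the MAP estimate for given data, together with the bias and variance that follow when the estimator is written as $[\beta^2 X + \mu\delta^2/n]/[\beta^2 + \delta^2/n]$; the function names, the numerical values, and the Monte Carlo check are ours and purely illustrative.

```python
import random

def map_estimate(data, delta2, mu, beta2):
    """MAP estimate of the mean: noise variance delta2, prior mean mu, prior variance beta2."""
    n = len(data)
    X = sum(data) / n                                   # sample mean
    return (beta2 * X + mu * delta2 / n) / (beta2 + delta2 / n)

def bias_and_variance(m, n, delta2, mu, beta2):
    """Bias and variance of the MAP estimator when the true mean is m."""
    denom = beta2 + delta2 / n
    bias = (beta2 * m + mu * delta2 / n) / denom - m
    variance = (beta2 / denom) ** 2 * delta2 / n
    return bias, variance

if __name__ == "__main__":
    m, n, delta2, mu, beta2 = 2.0, 4, 4.0, 0.0, 1.0     # illustrative values
    bias, variance = bias_and_variance(m, n, delta2, mu, beta2)
    print("bias^2 + variance:", bias ** 2 + variance)
    random.seed(1)
    trials = 100000
    total = 0.0
    for _ in range(trials):
        data = [random.gauss(m, delta2 ** 0.5) for _ in range(n)]
        total += (map_estimate(data, delta2, mu, beta2) - m) ** 2
    print("simulated mean-square error:", total / trials)
```

The two printed numbers should agree to within the statistical fluctuation of the simulation.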

6*1.4 Jointly Gaussian Parameters and Data 

We can generalize the problem treated in Sec. 6.1.3 by considering how to estimate 
m correlated Gaussian random variables 81, 82, ... , 0« by observing a number n 
of random variables v\, Vi, ... , v„. These "data" are themselves Gaussian random 
variables and are correlated with the estiraanda 81, 82, ... , 8 m in such a way that the 
joint probability density function of all n + m random variables has the multivariate 
Gaussian form. An example is the estimation of a correlated discrete-time Gaussian 
random process corrupted by additive Gaussian noise. 

Without loss of generality we can assume that all these variables have expected 
values equal to 0, for otherwise we could define new variables by subtracting out 
their known expected values. 

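In order to write out the joint probability density function $p(v, \theta)$ of all the
variables, we introduce the $(n + m)$-element column vector whose transpose is $(v^T\ \theta^T)$,
where $v$ is the $n$-element column vector of the data and $\theta$ is the $m$-element column
vector of the estimanda. Then

\[ p(v, \theta) = p(v_1, \ldots, v_n, \theta_1, \ldots, \theta_m) = C_1 \exp\left[ -\tfrac{1}{2} (v^T\ \theta^T)\, \mu \begin{pmatrix} v \\ \theta \end{pmatrix} \right], \tag{6-12} \]

where $C_1$ is a normalization constant and the $(n + m) \times (n + m)$ matrix $\mu$ is the
inverse of the covariance matrix $\phi$ of data and estimanda,

\[ \phi = \begin{bmatrix} \phi_{vv} & \phi_{v\theta} \\ \phi_{\theta v} & \phi_{\theta\theta} \end{bmatrix}. \tag{6-13} \]

Here $\phi_{vv} = E(vv^T)$ is the $n \times n$ covariance matrix of the data $v$, $\phi_{\theta\theta} = E(\theta\theta^T)$
is the $m \times m$ covariance matrix of the estimanda $\theta$, and $\phi_{\theta v} = \phi_{v\theta}^T = E(\theta v^T)$ is
the $m \times n$ cross-covariance matrix of the estimanda and the data. The conditional
density function of the parameters $\theta$, given the data $v$, is now

\[ p(\theta \mid v) = \frac{p(v, \theta)}{p(v)}, \tag{6-14} \]

where

\[ p(v) = C_0 \exp\left( -\tfrac{1}{2} v^T \phi_{vv}^{-1} v \right), \tag{6-15} \]

with $C_0$ another normalization constant, is the joint probability density function of
the data.

The matrix $\mu = \phi^{-1}$ in (6-12), like $\phi$ in (6-13), can be expressed in block
form,

\[ \mu = \begin{bmatrix} \mu_{vv} & \mu_{v\theta} \\ \mu_{\theta v} & \mu_{\theta\theta} \end{bmatrix}, \tag{6-16} \]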
where

\[ \mu_{\theta v} = \mu_{v\theta}^T = -\mu_{\theta\theta}\, \phi_{\theta v}\, \phi_{vv}^{-1}, \qquad \mu_{\theta\theta} = (\phi_{\theta\theta} - \phi_{\theta v}\, \phi_{vv}^{-1}\, \phi_{v\theta})^{-1}, \tag{6-17} \]

as can be shown by multiplying (6-13) by (6-16) to obtain the $(n + m) \times (n + m)$
identity matrix.

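By dividing (6-12) by (6-15), we find for the conditional density function

\[ \begin{aligned} p(\theta \mid v) &= (C_1/C_0) \exp\left[ -\tfrac{1}{2} v^T \mu_{vv} v - \tfrac{1}{2} v^T \mu_{v\theta} \theta - \tfrac{1}{2} \theta^T \mu_{\theta v} v - \tfrac{1}{2} \theta^T \mu_{\theta\theta} \theta + \tfrac{1}{2} v^T \phi_{vv}^{-1} v \right] \\ &= (C_1/C_0) \exp\left[ -\tfrac{1}{2} (\theta - Mv)^T \mu_{\theta\theta} (\theta - Mv) \right], \end{aligned} \tag{6-18} \]

where the $m \times n$ matrix $M$ is still to be determined. We find it by writing out the
arguments of the two exponential functions and comparing terms, and we obtain
for one of those terms

\[ -\tfrac{1}{2} \theta^T \mu_{\theta v} v = \tfrac{1}{2} \theta^T \mu_{\theta\theta} M v, \]

whence $\mu_{\theta\theta} M = -\mu_{\theta v}$, and hence by (6-17) the matrix $M$ must be

\[ M = -\mu_{\theta\theta}^{-1} \mu_{\theta v} = \phi_{\theta v} \phi_{vv}^{-1}. \tag{6-19} \]

The MAP estimators of the parameters $\theta$, those values that maximize the
conditional density function $p(\theta \mid v)$ in (6-18), are linear functions of the data $v$,

\[ \hat{\theta}(v) = Mv = \phi_{\theta v} \phi_{vv}^{-1} v. \tag{6-20} \]

The estimation matrix $M$ is the solution of a set of linear equations written in matrix
form as

\[ M \phi_{vv} = \phi_{\theta v}. \tag{6-21} \]

The $m \times m$ covariance matrix of the errors is furthermore

\[ B = E[(\theta - Mv)(\theta - Mv)^T] = \mu_{\theta\theta}^{-1} = \phi_{\theta\theta} - \phi_{\theta v} \phi_{vv}^{-1} \phi_{\theta v}^T = \phi_{\theta\theta} - M \phi_{\theta v}^T, \tag{6-22} \]

by (6-17) and (6-18).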
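
The relations (6-19) to (6-22) can be checked numerically. The sketch below is only an illustration under assumed values: the function name, the variable names, and the randomly generated covariance matrix are hypothetical. It solves (6-21) for $M$, forms $B$ as in (6-22), and keeps the inverse of the joint covariance matrix so that the block relations of (6-17) can be compared.

```python
import numpy as np


def gaussian_linear_estimator(phi_vv, phi_tv, phi_tt):
    """Estimation matrix M of (6-21) and error covariance B of (6-22)."""
    # M phi_vv = phi_tv; phi_vv is symmetric, so solve phi_vv M^T = phi_tv^T.
    M = np.linalg.solve(phi_vv, phi_tv.T).T
    B = phi_tt - M @ phi_tv.T
    return M, B


# Hypothetical joint covariance of n = 3 data and m = 2 estimanda.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
phi = A @ A.T + 5.0 * np.eye(5)           # symmetric and positive definite
n = 3
M, B = gaussian_linear_estimator(phi[:n, :n], phi[n:, :n], phi[n:, n:])
mu = np.linalg.inv(phi)                   # the block matrix of (6-16)
```

Solving the linear equations directly, rather than inverting $\phi_{vv}$, is a common numerical choice; either route gives the same $M$ in exact arithmetic.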

6.1.5 Bayes Estimates 

Sometimes it is possible to specify both the cost of an error in estimation and the 
joint prior probability density function of the estimanda, much as in hypothesis 
testing the costs of incorrect decisions and the prior probabilities of the hypotheses 
may be available. The cost to the experimenter of assigning a set of estimates $\hat{\theta}$
to the parameters when their true values are given by $\theta = (\theta_1, \theta_2, \ldots, \theta_m)$ will be
a function $C(\hat{\theta}, \theta)$ of both the true values and the estimates. As a function of the
estimates, it is smallest for $\hat{\theta} = \theta$, often depending only on the differences $(\hat{\theta}_k - \theta_k)$
between the estimates and the true values of the parameters. Given both a prior
probability density function $z(\theta)$ of the estimanda and a cost function $C(\hat{\theta}, \theta)$, the
observer is in a position to adopt the Bayes criterion that the estimation strategy
should yield the minimum average cost per experiment.

The optimum Bayes strategy for estimation can be derived in much the same
way as for hypothesis testing, and as in Sec. 1.1.2 we find that it requires us to
determine those values $\hat{\theta}$ of the estimanda $\theta$ for which the conditional risk

\[ C(\hat{\theta} \mid v) = \int_{\Theta} C(\hat{\theta}, \theta)\, p(\theta \mid v)\, d^m\theta \tag{6-23} \]

is minimum. The Bayes cost associated with any estimator $\hat{\theta}(v)$ is

\[ \bar{C} = \int_{\Theta} \int_{R_n} z(\theta)\, C(\hat{\theta}(v), \theta)\, p(v \mid \theta)\, d^n v\, d^m\theta = \int_{R_n} C(\hat{\theta}(v) \mid v)\, p(v)\, d^n v; \tag{6-24} \]

we have used (6-2). This Bayes cost will be smallest if to each point $v$ in the data
space $R_n$ we assign the vector $\hat{\theta}$ that minimizes the conditional risk in (6-23).
The Bayes cost can be written

\[ \bar{C} = \int_{\Theta} z(\theta)\, R[\hat{\theta}(v) \mid \theta]\, d^m\theta, \tag{6-25} \]

in terms of the risk associated with an arbitrary estimator $\hat{\theta}(v)$ of the parameters,
when their true values are $\theta$:

\[ R[\hat{\theta}(v) \mid \theta] = \int_{R_n} C(\hat{\theta}(v), \theta)\, p(v \mid \theta)\, d^n v. \tag{6-26} \]

When the estimanda are parameters of a signal $s(t; \theta)$, their Bayes estimator
is that set $\hat{\theta} = \hat{\theta}[v(t)]$ for which

\[ C[\hat{\theta} \mid v(t)] = \int_{\Theta} C(\hat{\theta}, \theta)\, p(\theta \mid v(t))\, d^m\theta \]

is minimum, with $p(\theta \mid v(t))$ defined as in (6-5) in terms of the likelihood functional
$\Lambda[v(t) \mid \theta]$ for detecting that signal in noise.

6.1.6 The Quadratic Cost Function 

Because the mean-square error is one of the principal measures of the quality of an
estimator, it is appropriate to consider the quadratic cost function

\[ C(\hat{\theta}, \theta) = (\hat{\theta} - \theta)^2, \tag{6-27} \]

which because of its mathematical simplicity is indeed the most frequently adopted.
The resulting estimator $\hat{\theta}(v)$ is called the minimum-mean-square-error (MMSE)
estimator.

The conditional risk in (6-23) is then

\[ C(\hat{\theta} \mid v) = \int_{-\infty}^{\infty} (\hat{\theta} - \theta)^2 p(\theta \mid v)\, d\theta = \hat{\theta}^2 - 2\hat{\theta} \int_{-\infty}^{\infty} \theta\, p(\theta \mid v)\, d\theta + \int_{-\infty}^{\infty} \theta^2 p(\theta \mid v)\, d\theta, \]
and it is minimum when

\[ \hat{\theta}(v) = \int_{-\infty}^{\infty} \theta\, p(\theta \mid v)\, d\theta = E(\theta \mid v). \tag{6-28} \]

The Bayes estimator is now the conditional expected value of the parameter $\theta$, given
the data $v$.

The minimum conditional risk equals the conditional variance of the parameter
$\theta$, given the data $v$:

\[ \min C(\hat{\theta}(v) \mid v) = \operatorname{Var}(\theta \mid v). \]

Multiplying by the overall density function $p(v)$ of the data, as defined in (6-3), and
integrating over the data space, we find the minimum Bayes cost to be

\[ \bar{C}_{\min} = \int_{R_n} \operatorname{Var}(\theta \mid v)\, p(v)\, d^n v = E_v[\operatorname{Var}(\theta \mid v)]. \tag{6-29} \]

When $E(\theta \mid v)$ is a linear function of the data $v$, as happens when the joint probability
density function $p(\theta, v) = z(\theta)p(v \mid \theta)$ of the parameter $\theta$ and the data $v$ is Gaussian
in both $\theta$ and $v$, $\operatorname{Var}(\theta \mid v)$ is independent of $v$, and the minimum Bayes cost $\bar{C}_{\min}$ is
simply the conditional variance $\operatorname{Var}(\theta \mid v)$ itself. Although this is now independent of
the data $v$, it is in general smaller than the prior variance $\operatorname{Var} \theta$, which is determined
entirely by the prior density function $z(\theta)$.

In our example in Sec. 6.1.3 of estimating the mean $m$ of a Gaussian random
variable from $n$ observations, we see by (6-8) that the estimator $\hat{m}$ in (6-10) is the
conditional expected value of the estimandum $m$, and it therefore serves not only as
the MAP, but also as the MMSE estimator of $m$. The conditional variance $\operatorname{Var}(m \mid v)$
of the mean $m$, given the data $v$, is by (6-8)

\[ \operatorname{Var}(m \mid v) = s^2 = \frac{\beta^2 (\delta^2/n)}{\beta^2 + (\delta^2/n)}, \tag{6-30} \]

and it is independent of $v$. As in (6-29), this must then equal the minimum Bayes
cost,

\[ \bar{C}_{\min} = \frac{\delta^2}{n}\, \frac{\beta^2}{\beta^2 + (\delta^2/n)}. \]

The risk associated with a true mean $m$ is, according to the definition in (6-26),

\[ \begin{aligned} R[\hat{m}(v) \mid m] &= \int_{R_n} [\hat{m}(v) - m]^2 p(v \mid m)\, d^n v = \int_{-\infty}^{\infty} [\hat{m}(\bar{X}) - m]^2 p(\bar{X} \mid m)\, d\bar{X} \\ &= \frac{\delta^2}{n}\, \frac{\beta^4 + (\mu - m)^2 (\delta^2/n)}{[\beta^2 + (\delta^2/n)]^2}, \end{aligned} \tag{6-31} \]

where $p(\bar{X} \mid m)$ is the probability density function of the sample mean $\bar{X}$ when the
true mean of the data is $m$. By using (6-7) and (6-25) the reader should verify the
minimum Bayes cost just derived.

When several parameters $\theta$ are to be estimated, what corresponds to the
quadratic cost function is the positive-definite quadratic form

\[ C(\hat{\theta}, \theta) = \sum_{i=1}^{m} \sum_{j=1}^{m} q_{ij} (\hat{\theta}_i - \theta_i)(\hat{\theta}_j - \theta_j). \tag{6-32} \]

The Bayes estimates are again the conditional expected values of the parameters,
given the data $v$,

\[ \hat{\theta}_k(v) = E(\theta_k \mid v) = \int_{\Theta} \theta_k\, p(\theta \mid v)\, d^m\theta, \qquad k = 1, 2, \ldots, m. \tag{6-33} \]

This estimator is the same as though the cost of an error $\hat{\theta}_k(v) - \theta_k$ were simply its
square, and the total cost were the sum

\[ C(\hat{\theta}, \theta) = \sum_{k=1}^{m} (\hat{\theta}_k - \theta_k)^2. \tag{6-34} \]

When as in Sec. 6.1.4 the data $v$ and the estimanda $\theta$ are jointly Gaussian
random variables, we can see from (6-18) that the conditional expected values of the
estimanda form the column vector

\[ \hat{\theta}(v) = Mv \]

with the $m \times n$ matrix $M$ given in (6-19). These therefore serve as the
minimum-mean-square-error estimators of the parameters $\theta$ in the joint probability density
function in (6-12). They constitute linear estimators of the parameters because they
depend linearly on the data $v$.

6.1.7 The Principle of Orthogonality

In many estimation problems one cannot be sure that the data and the estimanda
are jointly Gaussian distributed, but one chooses to adopt a linear estimator of the
form $\hat{\theta} = Mv$ anyhow. For one thing, linear operations are simplest to carry out by
both digital and analog means. One then seeks the estimation matrix $M$ for which
the total mean-square error (6-34) is minimum. It must be the same as the optimum
matrix $M$ when the data and the estimanda are really jointly Gaussian distributed,
and it is given by (6-20) and (6-21).

A quick way of deriving the MMSE linear estimator is based on the principle
of orthogonality [Hel91, pp. 495-500], [Pap91, p. 204]. Two random variables $A$
and $B$ are said to be orthogonal if $E(AB) = 0$. Two $n$-tuples of random variables
$v = (v_1, v_2, \ldots, v_n)^T$ and $w = (w_1, w_2, \ldots, w_n)^T$ are orthogonal if each element of
the $n \times n$ matrix $E(vw^T)$ equals zero. The "space" of these random variables then
takes on the properties of a metric space in which the squared "distance" between
two such vectors is specified by the expected value

\[ E[(v - w)^2] = E[(v^T - w^T)(v - w)] \]

of their squared difference.

In our problem of estimating the $m$ random variables $\theta = (\theta_1, \theta_2, \ldots, \theta_m)^T$,
the data $v$ are thought of as spanning a subspace $V$ of the metric space. The MMSE
linear estimator $\hat{\theta} = Mv$, as a linear combination of the data $v$, also lies in that
subspace $V$. Because the average cost, measured by the squared distance between $\hat{\theta}$
and the vector $\theta$ of true values, is to be minimum, the error vector $\hat{\theta} - \theta$ must be
as short as possible and hence perpendicular to the subspace $V$; that is, it must be
orthogonal to all vectors $v$ in $V$,

\[ E[(\hat{\theta} - \theta) v^T] = 0. \]

Thus

\[ E(\theta v^T) = E(Mvv^T) = M E(vv^T) \]

or

\[ \phi_{\theta v} = M \phi_{vv}, \]

as in (6-21).

The covariance matrix $B$ of the errors is

\[ B = E[(\hat{\theta} - \theta)(\hat{\theta}^T - \theta^T)] = E[(\theta - \hat{\theta})\theta^T] - E[(\theta - \hat{\theta})\hat{\theta}^T]. \]

Because the estimator $\hat{\theta}(v)$ lies in the subspace $V$ and the error $\theta - \hat{\theta}$ is orthogonal
to it, the second term vanishes, and we find from the first

\[ B = E(\theta\theta^T) - E(Mv\theta^T) = \phi_{\theta\theta} - M \phi_{\theta v}^T, \]

as in (6-22).

The statistical model that most commonly gives rise to an estimation problem
of this kind is one in which the data $v$ result from some known linear operation on
the unknowns $\theta_1, \theta_2, \ldots, \theta_m$, with the addition of independently random errors,

\[ v_j = \sum_{i=1}^{m} K_{ji} \theta_i + e_j, \qquad 1 \le j \le n, \]

or, in matrix form,

\[ v = K\theta + e, \tag{6-35} \]

where $e$ is a column vector of the $n$ errors or "noise" variates, which are independent
of the estimanda and usually assumed to have equal variances $\operatorname{Var} e_j = \sigma^2$; $K$ is a
known $n \times m$ matrix. Then

\[ \phi_{\theta v} = E[\theta(K\theta + e)^T] = \phi_{\theta\theta} K^T, \]

\[ \phi_{vv} = E[(K\theta + e)(\theta^T K^T + e^T)] = K \phi_{\theta\theta} K^T + \sigma^2 I, \]

with $I$ the $n \times n$ identity matrix. The equations to be solved for the $m \times n$ estimation
matrix $M$ are then, from (6-21),

\[ M K \phi_{\theta\theta} K^T + \sigma^2 M = \phi_{\theta\theta} K^T. \tag{6-36} \]
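
A minimal numerical sketch of solving (6-36) follows. The function name and all matrices and values are hypothetical choices made for illustration; the only relations used are the two covariance expressions given just above and the linear equations (6-21).

```python
import numpy as np


def linear_model_estimator(K, phi_tt, sigma2):
    """MMSE linear estimation matrix for v = K theta + e, from (6-21)."""
    n = K.shape[0]
    phi_tv = phi_tt @ K.T                           # cross-covariance of theta and v
    phi_vv = K @ phi_tt @ K.T + sigma2 * np.eye(n)  # covariance of the data
    return np.linalg.solve(phi_vv, phi_tv.T).T      # solves M phi_vv = phi_tv


# Hypothetical example: n = 4 data, m = 2 parameters.
K = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
phi_tt = np.array([[2.0, 0.5], [0.5, 1.0]])
sigma2 = 0.25
M = linear_model_estimator(K, phi_tt, sigma2)
```

The estimate for a given data vector `v` is then `M @ v`.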

The linear filtering and prediction of discrete-time stochastic processes can 
be treated in this framework and has given rise to a vast literature. Elementary 
introductions can be found in such works as [Lar79, vol. 2, pp. 112-31], [Hel91, 
pp. 507-39], [Pap91, pp. 486-93, 512-28], [Sch91a, pp. 323-33, 423-78], to name only
a few. Continuous-time stochastic processes can be linearly filtered and predicted by 
analogous techniques, which were first developed by Norbert Wiener [Wie60]. The 
reader can also consult [Mid60a, pp. 697-712], [Van68, Ch. 6], [Hel91, pp. 540-60], 
and [Pap91, pp. 480-6, 508-11], to cite only a few of the many references. A 
comprehensive bibliography up to 1974 was listed by Kailath [Kai74]. We shall take
up this topic in Chapter 11 when we study the detection of Gaussian stochastic
signals in Gaussian noise.

When as in (6-35) the data $v$ are linearly related to the unknown parameters
$\theta$, with independently random additive errors, what correspond to the
maximum-likelihood estimators introduced in Sec. 6.1.2 are the estimators $\hat{\theta}(v)$ determined by
the method of least squares. If the errors $e_j$ are assumed to be independent Gaussian
random variables with equal variances $\sigma^2$, the conditional probability density
function of the data $v$ will be

\[ p(v \mid \theta) = C \exp\left[ -\frac{1}{2\sigma^2} (v - K\theta)^T (v - K\theta) \right], \]

with $C$ a normalization constant, and this is maximum when the values of the
estimanda $\theta_1, \theta_2, \ldots, \theta_m$ are such that the sum of squares

\[ (v - K\theta)^T (v - K\theta) = \sum_{j=1}^{n} \left( v_j - \sum_{i=1}^{m} K_{ji} \theta_i \right)^2 \]

is minimum [Bic77, pp. 94-9], [Hel91, pp. 500-6], [Sch91a, pp. 359-415].
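
The least-squares estimate can be computed with a standard solver. The sketch below is an illustration only: the function name, the matrix `K`, and the simulated data are assumptions, and NumPy's `lstsq` routine is used here simply as one convenient way to minimize the sum of squares.

```python
import numpy as np


def least_squares_estimate(K, v):
    """Values of theta minimizing (v - K theta)^T (v - K theta)."""
    theta_hat, *_ = np.linalg.lstsq(K, v, rcond=None)
    return theta_hat


# Hypothetical example: fit an intercept and a slope to four data.
K = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
theta_true = np.array([1.0, -0.5])
rng = np.random.default_rng(1)
v = K @ theta_true + 0.1 * rng.standard_normal(4)
theta_hat = least_squares_estimate(K, v)
residual = v - K @ theta_hat
```

At the minimum the residual is orthogonal to the columns of `K`, so `K.T @ residual` vanishes to within rounding error.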



6.2 ESTIMATION OF SIGNAL PARAMETERS 
6.2.1 Maximum-likelihood Estimators 



The maximum-likelihood estimates of a set of parameters $\theta$ of a signal $s(t; \theta)$ are
defined in Sec. 6.1.2 as those values of $\theta$ for which the likelihood functional $\Lambda[v(t) \mid \theta]$
is maximum. When the ambient noise is white and Gaussian, with unilateral spectral
density $N$, this functional is

\[ \Lambda[v(t) \mid \theta] = \exp\left\{ \frac{2}{N} \int_0^T s(t; \theta)\, v(t)\, dt - \frac{1}{N} \int_0^T [s(t; \theta)]^2\, dt \right\} \tag{6-37} \]

as in (2-74). Let us distinguish the amplitude $A$ and write the signal as

\[ s(t; \theta) = A f(t; \theta'), \]

where $\theta'$ is the set of parameters other than the amplitude. Then

\[ \ln \Lambda[v(t) \mid A, \theta'] = \frac{2A}{N} \int_0^T f(t; \theta')\, v(t)\, dt - \frac{A^2}{N} \int_0^T [f(t; \theta')]^2\, dt, \]

and maximizing with respect to the amplitude $A$ we find

\[ \max_A \ln \Lambda[v(t) \mid A, \theta'] = \frac{\left[ \int_0^T f(t; \theta')\, v(t)\, dt \right]^2}{N \int_0^T [f(t; \theta')]^2\, dt}, \tag{6-38} \]

which remains to be maximized with respect to the other parameters. Denoting
the maximum-likelihood estimates of those other parameters by $\hat{\theta}'$, we write the
maximum-likelihood estimator of the amplitude $A$ as

\[ \hat{A} = \frac{\int_0^T f(t; \hat{\theta}')\, v(t)\, dt}{\int_0^T [f(t; \hat{\theta}')]^2\, dt}. \]






It is generally difficult to determine the values of the parameters $\theta'$ for which the
right side of (6-38) is maximum. One way of doing so approximately is to build
a bank of parallel filters, each matched to the signal $f(t; \theta')$ for a different set $\theta'$.
These sets are densely spaced over the $(m - 1)$-dimensional region $\Theta'$ in which the
parameters $\theta'$ are expected to lie. The input $v(t)$ is applied to each of these filters,
and the output of each is sampled at the end of the observation interval $(0, T)$,
squared, and divided by the denominator of (6-38). The value of $\theta'$ for the filter for
which the resulting quantity is largest is then approximately the maximum-likelihood
estimate.

If the points $\theta'_j$ of the space $\Theta'$ for which we build filters matched to the signals
$f(t; \theta'_j)$ are close enough together, the noise components of the outputs of these filters
will be highly correlated. By interpolating the observed values of

\[ \Gamma(\theta'_j) = \frac{\left[ \int_0^T f(t; \theta'_j)\, v(t)\, dt \right]^2}{N \int_0^T [f(t; \theta'_j)]^2\, dt} \]

and finding the maximum of the interpolated function $\Gamma(\theta')$, we can then expect
to be able to estimate the parameters $\theta'$ nearly as accurately as though we had
filters matched to signals $f(t; \theta')$ for a continuum of values of $\theta'$, which is impossible.



6.2.2 Estimation of Arrival Time

As an example, suppose that the estimandum is the arrival time $\tau$ of a signal

\[ s(t; A, \tau) = A f(t - \tau). \]

Then by (6-38) we must find the maximum value of

\[ \Gamma(\tau) = \max_A \ln \Lambda[v(t) \mid A, \tau] = \frac{\left[ \int_0^T f(t - \tau)\, v(t)\, dt \right]^2}{N \int_0^T [f(t - \tau)]^2\, dt}. \tag{6-39} \]

If we assume the observation interval to be much longer than the duration $T'$ of
the signal $f(t)$ ($T \gg T'$) and neglect the possibility that the signal may overlap one
end or the other of the interval $(0, T)$, the denominator of (6-39) is independent of
$\tau$, and we must maximize the magnitude of

\[ G(\tau) = \int_0^T f(t - \tau)\, v(t)\, dt = \int_0^T f(u - \tau)\, v(u)\, du. \tag{6-40} \]



In order to see how to generate this function $G(\tau)$ in real time, let us consider a
filter matched to the signal $f(t)$ over an interval $(0, T')$, outside of which we assume
$f(t) = 0$; $T' \ll T$. It will have an impulse response

\[ k(s) = \begin{cases} f(T' - s), & 0 \le s \le T', \\ 0, & s < 0,\ s > T', \end{cases} \tag{6-41} \]

and the output of this filter when the receiver input $v(t)$ is applied to it will be

\[ v_o(t) = \int_0^{T'} f(T' - s)\, v(t - s)\, ds = \int_{-\infty}^{\infty} f(T' - s)\, v(t - s)\, ds = \int_{-\infty}^{\infty} f(T' - t + u)\, v(u)\, du. \]

By comparison with (6-40) we see that

\[ |G(\tau)| = |v_o(\tau + T')|. \tag{6-42} \]

Hence the maximum-likelihood estimate $\hat{\tau}$ of the arrival time is given by

\[ \hat{\tau} = t_m - T', \]

where $t_m$ is the time at which the rectified output $|v_o(t)|$ of the matched filter is
maximum. That output will have many peaks, and the highest of them identifies the
time $t_m$. It is unnecessary to build a bank of filters matched to signals with different
values of $\tau$; a single filter, matched as in (6-41) to $f(t)$, will do.
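
A discrete-time sketch of this procedure is given below. It is an illustration under assumed conditions, not a receiver design: the pulse shape, sampling interval, noise level, and function name are all hypothetical. Sliding correlation of the sampled input with the sampled pulse plays the role of the matched filter, and the location of the largest magnitude gives the estimate.

```python
import numpy as np


def estimate_arrival_time(v, f, dt):
    """Arrival-time estimate from samples of the input v(t) and the pulse f(t)."""
    # G[k] = sum_j f[j] * v[j + k] approximates G(tau) of (6-40) at tau = k * dt,
    # apart from a constant factor dt that does not move the peak.
    G = np.correlate(v, f, mode="valid")
    return np.argmax(np.abs(G)) * dt


# Hypothetical example: a smooth pulse of duration T' = 1 in an interval T = 10.
dt = 0.01
f = np.sin(np.pi * np.arange(100) * dt) ** 2
true_index = 317                           # true arrival time 3.17
clean = np.zeros(1000)
clean[true_index:true_index + f.size] = f
rng = np.random.default_rng(2)
noisy = clean + 0.05 * rng.standard_normal(clean.size)
tau_clean = estimate_arrival_time(clean, f, dt)
tau_noisy = estimate_arrival_time(noisy, f, dt)
```

Because the correlation index `k` already measures the delay of the pulse, no subtraction of $T'$ is needed here; that shift arises only when the correlation is produced by a causal filter as in (6-41).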

The question now arises how accurate is this estimate $\hat{\tau}$ of the arrival time $\tau$.
We gauge that by the mean-square error $E[(\hat{\tau} - \tau_0)^2]$, where $\tau_0$ is the true value of
the estimandum $\tau$. In order to evaluate it, we must assume that the signal-to-noise
ratio is large. We can then drop the absolute-value signs in (6-42) and maximize
only

\[ G(\tau) = \int_0^T f(t - \tau)\, v(t)\, dt; \]

that is, we solve the equation

\[ \int_0^T f'(t - \hat{\tau})\, v(t)\, dt = 0, \tag{6-43} \]

the prime indicating differentiation with respect to the argument of the function.
When the signal-to-noise ratio is large, the root of this equation corresponding to
the largest peak value of $G(\tau)$ will lie close to the true value $\tau_0$ of the arrival time,
and we can make the power-series expansion

\[ f'(t - \hat{\tau}) = f'(t - \tau_0) - (\hat{\tau} - \tau_0) f''(t - \tau_0) + \cdots \]

and neglect terms of higher order in $(\hat{\tau} - \tau_0)$ than the first. Putting this into (6-43),
we find for the error in the estimate

\[ \hat{\tau} - \tau_0 \approx \frac{\int_0^T f'(t - \tau_0)\, v(t)\, dt}{\int_0^T f''(t - \tau_0)\, v(t)\, dt}. \tag{6-44} \]



The expected value of the numerator in (6-44) is

\[ E \int_0^T f'(t - \tau_0)[A f(t - \tau_0) + n(t)]\, dt = A \int_0^T f'(t - \tau_0) f(t - \tau_0)\, dt = \tfrac{1}{2} A \{ [f(T - \tau_0)]^2 - [f(-\tau_0)]^2 \}, \]

and this expected value vanishes because of our assumption that the signal does
not overlap either end of the observation interval $(0, T)$. That of the denominator,
however, does not:

\[ E \int_0^T f''(t - \tau_0)\, v(t)\, dt = A \int_0^T f''(t - \tau_0) f(t - \tau_0)\, dt = -A \int_0^T [f'(t - \tau_0)]^2\, dt \ne 0. \]

The numerator of (6-44) is therefore of first order in the noise and has no component
due to the signal, and within terms of first order in the noise we can replace the
denominator by its expected value, obtaining for the error

\[ \hat{\tau} - \tau_0 \approx -\frac{\int_0^T f'(t - \tau_0)\, n(t)\, dt}{A \int_0^T [f'(t - \tau_0)]^2\, dt}. \]

In this limit of large signal-to-noise ratio, therefore,

\[ E(\hat{\tau} - \tau_0) \approx 0, \]

and the estimator is said to be asymptotically unbiased. We shall find this generally
true of maximum-likelihood estimators.
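The variance of the estimator is

\[ \operatorname{Var} \hat{\tau} = \left[ A \int_{-\infty}^{\infty} [f'(t)]^2\, dt \right]^{-2} E\left\{ \left[ \int_0^T f'(t - \tau_0)\, n(t)\, dt \right]^2 \right\} \]

when the observation interval is much longer than the duration of the signal, and
by an analysis like that in (2-25) we find

\[ E\left\{ \left[ \int_0^T f'(t - \tau_0)\, n(t)\, dt \right]^2 \right\} = \frac{N}{2} \int_{-\infty}^{\infty} [f'(t)]^2\, dt, \]

whereupon the error variance is

\[ \operatorname{Var} \hat{\tau} = \frac{N}{2A^2} \left[ \int_{-\infty}^{\infty} [f'(t)]^2\, dt \right]^{-1} = \frac{N}{2E\beta^2}, \tag{6-45} \]

with

\[ E = A^2 \int_{-\infty}^{\infty} [f(t)]^2\, dt \]

the energy of the signal and

\[ \beta^2 = \frac{\int_{-\infty}^{\infty} [f'(t)]^2\, dt}{\int_{-\infty}^{\infty} [f(t)]^2\, dt}. \tag{6-46} \]

For a low-pass signal, $\beta$ is the root-mean-square (rms) bandwidth; in terms of the
spectrum $F(\omega)$ of the signal, defined as the Fourier transform of $f(t)$,

\[ \beta^2 = \frac{\int_{-\infty}^{\infty} \omega^2 |F(\omega)|^2\, d\omega}{\int_{-\infty}^{\infty} |F(\omega)|^2\, d\omega}. \tag{6-47} \]

Thus the variance of the error in an estimate of the arrival time $\tau$ is inversely
proportional to both the signal-to-noise ratio $d^2 = 2E/N$ and the square $\beta^2$ of the rms
bandwidth of the signal.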
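
To make the dependence on bandwidth concrete, the following sketch evaluates (6-45) and (6-46) numerically for a sampled pulse. The Gaussian pulse, its width `s`, the amplitude, the noise density, and the function name are assumptions chosen for illustration; for that pulse the integrals can be done by hand and give $\beta^2 = 1/(2s^2)$, which serves as a check.

```python
import numpy as np


def arrival_time_variance(f, dt, A, N):
    """Approximate error variance (6-45) for samples f of the pulse shape."""
    df = np.gradient(f, dt)                    # numerical derivative f'(t)
    energy = A ** 2 * np.sum(f ** 2) * dt      # E = A^2 * integral of f^2
    beta2 = np.sum(df ** 2) / np.sum(f ** 2)   # squared rms bandwidth (6-46)
    d2 = 2.0 * energy / N                      # signal-to-noise ratio d^2
    return 1.0 / (d2 * beta2), beta2


# Hypothetical Gaussian pulse f(t) = exp(-t^2 / (2 s^2)), for which beta^2 = 1 / (2 s^2).
s, dt = 0.5, 0.001
t = np.arange(-5.0, 5.0, dt)
f = np.exp(-t ** 2 / (2.0 * s ** 2))
variance, beta2 = arrival_time_variance(f, dt, A=1.0, N=0.1)
```

Halving the pulse width `s` quadruples $\beta^2$ but also halves the energy at fixed amplitude, so at fixed amplitude the variance is halved; at fixed energy it would fall by a factor of four.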

6.2.3 Asymptotic Variance of Maximum-likelihood Estimators

Our result for the variance of the maximum-likelihood estimator of the arrival time
$\tau$ in the limit of large signal-to-noise ratio is a special instance of a general formula
due to Fisher [Fis22] for the asymptotic variance of maximum-likelihood estimators.
We derive it for a finite set $v = (v_1, v_2, \ldots, v_n)$ of data, passing later to the limit of
an infinite number of samples.

The maximum-likelihood estimator of a single parameter $\theta$ maximizes the joint
probability density function $p(v \mid \theta)$ of the data or, what is the same thing, its
logarithm

\[ g(v \mid \theta) = \ln p(v \mid \theta). \tag{6-48} \]

The estimate $\hat{\theta}$ is a zero of the function

\[ \frac{\partial}{\partial\theta} g(v \mid \theta) = g_\theta(v \mid \theta), \tag{6-49} \]

a subscript $\theta$ indicating differentiation with respect to $\theta$. Expanding this function of
$\theta$ in a Taylor series about the true value $\theta_0$ of the estimandum, we must solve

\[ g_\theta(v \mid \hat{\theta}) = g_\theta(v \mid \theta_0) + (\hat{\theta} - \theta_0)\, g_{\theta\theta}(v \mid \theta_0) + \cdots = 0. \tag{6-50} \]

Again assuming that the signal-to-noise ratio is so large, and the estimate so accurate,
that terms of higher order in $(\hat{\theta} - \theta_0)$ can be neglected, we find for the error

\[ \hat{\theta} - \theta_0 \approx -\frac{g_\theta(v \mid \theta_0)}{g_{\theta\theta}(v \mid \theta_0)}. \tag{6-51} \]

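The numerator of (6-51) has expected value zero, as can be shown by differentiating
the normalization condition

\[ \int_{R_n} p(v \mid \theta)\, d^n v = 1 \]

with respect to $\theta$:

\[ \frac{d}{d\theta} \int_{R_n} p(v \mid \theta)\, d^n v = \int_{R_n} g_\theta(v \mid \theta)\, p(v \mid \theta)\, d^n v = E[g_\theta(v \mid \theta)] = 0. \tag{6-52} \]

Here $E$ refers to an expectation with respect to the joint probability density function
of the data $v$.

The expected value of the denominator of (6-51), on the other hand, does not
in general vanish, for differentiating once more we find

\[ \begin{aligned} &\int_{R_n} g_{\theta\theta}(v \mid \theta)\, p(v \mid \theta)\, d^n v + \int_{R_n} g_\theta(v \mid \theta)\, p_\theta(v \mid \theta)\, d^n v \\ &\quad = \int_{R_n} g_{\theta\theta}(v \mid \theta)\, p(v \mid \theta)\, d^n v + \int_{R_n} [g_\theta(v \mid \theta)]^2 p(v \mid \theta)\, d^n v = 0, \end{aligned} \]

whereupon

\[ E[g_{\theta\theta}(v \mid \theta)] = -E\{[g_\theta(v \mid \theta)]^2\}, \tag{6-53} \]

and $E\{[g_\theta(v \mid \theta)]^2\} \ne 0$ unless the probability density function $p(v \mid \theta)$ does not really
depend on the parameter $\theta$ at all.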

The numerator of (6-51), therefore, is proportional to the noise alone; its signal
component must vanish. The denominator is of the form "signal plus noise." To
terms of first order in the noise, the denominator of (6-51) can thus be replaced by
its expected value,

\[ \hat{\theta} - \theta_0 \approx -\frac{g_\theta(v \mid \theta_0)}{E[g_{\theta\theta}(v \mid \theta_0)]}, \tag{6-54} \]

and the maximum-likelihood estimator $\hat{\theta}(v)$ is asymptotically unbiased:

\[ E[\hat{\theta}(v)] \approx \theta_0. \]

The variance of this estimator is obtained by squaring (6-54), averaging, and using
(6-53),

\[ \operatorname{Var} \hat{\theta}(v) \approx \frac{1}{E\{[g_\theta(v \mid \theta_0)]^2\}}, \tag{6-55} \]

and this is Fisher's formula.

If in this analysis we divide $p(v \mid \theta)$ by $p_0(v)$, the joint probability density
function of the data when no signal is present, our result will be unchanged, for $p_0(v)$
is independent of the parameter $\theta$. Then because in the limit of infinitely dense
sampling of the input $v(t)$

\[ \frac{p(v \mid \theta)}{p_0(v)} \to \Lambda[v(t) \mid \theta], \]

the variance of a maximum-likelihood estimator $\hat{\theta}[v(t)]$ based on the input $v(t)$ to
our receiver is

\[ \operatorname{Var} \hat{\theta}[v(t)] \approx \frac{1}{\operatorname{Var} g_\theta[v(t) \mid \theta_0]}. \tag{6-56} \]

Here

\[ g[v(t) \mid \theta] = \ln \Lambda[v(t) \mid \theta], \]

and $g_\theta[v(t) \mid \theta]$ is its partial derivative with respect to the estimandum $\theta$. Into this we
put the true value $\theta = \theta_0$ of the parameter $\theta$. By virtue of (6-53) this variance can
also be written as

\[ \operatorname{Var} \hat{\theta}[v(t)] \approx -\frac{1}{E\{g_{\theta\theta}[v(t) \mid \theta_0]\}}, \tag{6-57} \]

which may sometimes be more easily evaluated. Keep in mind that these results are
approximations valid when the signal-to-noise ratio is so large that the errors are on
the average small.

The reader is invited to show that (6-56) and (6-57) reduce to (6-45) when the
parameter $\theta$ is the arrival time $\tau$ and we assume that the amplitude $A$ of the signal
$A f(t - \tau)$ is known and not to be estimated.

When several parameters $\theta = (\theta_1, \theta_2, \ldots, \theta_m)$ are to be estimated, their
maximum-likelihood estimators are again asymptotically unbiased in the limit of
large signal-to-noise ratio. The estimators may be correlated, and the generalized
Fisher formula

\[ \operatorname{Cov}(\hat{\theta}_j, \hat{\theta}_k) = E[(\hat{\theta}_j - \theta_{0j})(\hat{\theta}_k - \theta_{0k})] = B_{jk} \approx (\Gamma^{-1})_{jk}, \tag{6-58} \]

corresponding to (6-56), approximates their covariances. Here $\Gamma$ is an $m \times m$ matrix
whose elements are

\[ \Gamma_{jk} = \operatorname{Cov}\left\{ \frac{\partial g}{\partial\theta_j}, \frac{\partial g}{\partial\theta_k} \right\} = -E\left[ \frac{\partial^2 g}{\partial\theta_j\, \partial\theta_k} \right], \qquad g = g[v(t) \mid \theta], \tag{6-59} \]

into which one substitutes the true values $\theta_0$ of the estimanda $\theta$. The equality of these
two forms will be demonstrated subsequently. The matrix $\Gamma$ is called the Fisher
information matrix.

Denoting a partial derivative with respect to the $j$th parameter $\theta_j$ by a subscript
$\theta_j$, we write the counterpart of (6-50) as the multivariate Taylor expansion about the
point $\theta_0$,

\[ g_{\theta_i}(v \mid \hat{\theta}) \approx g_{\theta_i}(v \mid \theta_0) + \sum_{k=1}^{m} (\hat{\theta}_k - \theta_{0k})\, g_{\theta_i\theta_k}(v \mid \theta_0) \approx 0, \qquad 1 \le i \le m. \]

This provides approximate equations for the estimates $\hat{\theta}_k$. At large signal-to-noise
ratio, we can replace the random variables

\[ g_{\theta_i\theta_k}(v \mid \theta_0) \]

by their expected values

\[ E[g_{\theta_i\theta_k}(v \mid \theta_0)] = -\Gamma_{ik}, \]

as in (6-59). Solving the resulting simultaneous equations for the errors, we find

\[ \hat{\theta}_i - \theta_{0i} \approx \sum_{k=1}^{m} (\Gamma^{-1})_{ik}\, g_{\theta_k}(v \mid \theta_0), \qquad 1 \le i \le m. \]

If we now introduce a column vector $T$ whose elements are the errors

\[ T_i = \hat{\theta}_i - \theta_{0i} \]

and a column vector $G$ whose elements are the random variables

\[ G_k = g_{\theta_k}(v \mid \theta_0), \]

we can write this set of equations in matrix form as

\[ T \approx \Gamma^{-1} G. \tag{6-60} \]

The covariance matrix of the errors is then

\[ B = E(TT^T), \]

where superscript $T$ indicates the transpose of a vector.
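
By twice differentiating the normalization integral for $p(v \mid \theta)$, we obtain

\[ \begin{aligned} \frac{\partial}{\partial\theta_k} \int_{R_n} g_{\theta_i}(v \mid \theta)\, p(v \mid \theta)\, d^n v &= \int_{R_n} g_{\theta_i\theta_k}(v \mid \theta)\, p(v \mid \theta)\, d^n v + \int_{R_n} g_{\theta_i}(v \mid \theta)\, g_{\theta_k}(v \mid \theta)\, p(v \mid \theta)\, d^n v \\ &= E[g_{\theta_i\theta_k}(v \mid \theta)] + E[g_{\theta_i}(v \mid \theta)\, g_{\theta_k}(v \mid \theta)] = 0. \end{aligned} \]

Hence

\[ \Gamma_{ik} = E[g_{\theta_i}(v \mid \theta)\, g_{\theta_k}(v \mid \theta)] = \operatorname{Cov}\{g_{\theta_i}, g_{\theta_k}\}, \]

as in (6-59), or in matrix notation

\[ \Gamma = E(GG^T). \]

Thus from (6-60) we obtain

\[ B \approx E(\Gamma^{-1} GG^T \Gamma^{-1}) = \Gamma^{-1} \Gamma\, \Gamma^{-1} = \Gamma^{-1}, \]

which, after we again pass to the limit of infinitely dense sampling of the input $v(t)$,
is the matrix form of (6-58).

We shall apply (6-58) in Sec. 6.3 to the simultaneous estimation of the arrival
time and carrier frequency of a narrowband signal such as a radar echo.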

6.2.4 The Cramér-Rao Inequality

The Fisher formulas (6-56) and (6-58) approximate the mean-square error and the
covariance matrix of the maximum-likelihood estimators when the signal-to-noise
ratio is large. A bound on the mean-square error of any estimator is provided by the
Cramér-Rao inequality [Cra46, Sec. 32.3], [Rao45]. For any estimator $\hat{\theta}(v)$ satisfying
certain easy conditions of good behavior, the mean-square error is subject to a lower
bound

\[ E\{[\hat{\theta}(v) - \theta]^2\} \ge \frac{(d\bar{\theta}/d\theta)^2}{E\{[g_\theta(v \mid \theta)]^2\}}, \tag{6-61} \]

where $g_\theta(v \mid \theta)$ is defined as in (6-49) and

\[ \bar{\theta}(\theta) = E[\hat{\theta}(v) \mid \theta] = \int_{R_n} \hat{\theta}(v)\, p(v \mid \theta)\, d^n v. \tag{6-62} \]

The numerator in (6-61) equals 1 if $\hat{\theta}(v)$ is an unbiased estimator.

We demonstrate (6-61) as follows. From (6-62) we obtain by differentiation

\[ \frac{d\bar{\theta}}{d\theta} = \int_{R_n} \hat{\theta}(v)\, p_\theta(v \mid \theta)\, d^n v = \int_{R_n} \hat{\theta}(v)\, g_\theta(v \mid \theta)\, p(v \mid \theta)\, d^n v. \]

Multiplying (6-52) by $\theta$ and subtracting from this, we find

\[ \frac{d\bar{\theta}}{d\theta} = \int_{R_n} [\hat{\theta}(v) - \theta]\, g_\theta(v \mid \theta)\, p(v \mid \theta)\, d^n v = E\{[\hat{\theta}(v) - \theta]\, g_\theta(v \mid \theta)\}. \]

The Schwarz inequality for expectations (4-73) states that for any random variables
$A$ and $B$

\[ [E(AB)]^2 \le E(A^2)\, E(B^2). \]

Taking $A = \hat{\theta}(v) - \theta$ and $B = g_\theta(v \mid \theta)$, we obtain

\[ \left( \frac{d\bar{\theta}}{d\theta} \right)^2 \le E\{[\hat{\theta}(v) - \theta]^2\}\, E\{[g_\theta(v \mid \theta)]^2\}, \]

whence (6-61) by division.

As an example consider the estimation of the mean $m$ of $n$ independent Gaussian
random variables $v = (v_1, v_2, \ldots, v_n)$, treated in Sec. 6.1.3. From (6-6)

\[ g(v \mid m) = \ln p(v \mid m) = -\frac{1}{2\delta^2} \sum_{k=1}^{n} (v_k - m)^2 - \frac{n}{2} \ln(2\pi\delta^2). \]

Here the logarithmic derivative is

\[ g_m(v \mid m) = \frac{n(\bar{X} - m)}{\delta^2}, \tag{6-63} \]

where $\bar{X}$ is the sample mean defined in (6-9), and the denominator of (6-61) is

\[ E\{[g_m(v \mid m)]^2\} = \operatorname{Var} g_m(v \mid m) = \frac{n^2}{\delta^4} \operatorname{Var} \bar{X} = \frac{n}{\delta^2}. \]

One estimator of the mean $m$ is the biased one in (6-10), and for it, by (6-11),

\[ \frac{d\bar{m}}{dm} = \frac{\beta^2}{\beta^2 + \delta^2/n}. \]

The Cramér-Rao inequality (6-61) then states that

\[ E\{[\hat{m}(v) - m]^2\} \ge \frac{\delta^2}{n} \left[ \frac{\beta^2}{\beta^2 + \delta^2/n} \right]^2. \]

The left side is the risk $R[\hat{m}(v) \mid m]$ given in (6-31), and it is indeed greater than
the right side except when $m = \mu$, whereupon the two sides are equal. For the
maximum-likelihood estimator $\hat{m}(v) = \bar{X}$, on the other hand, both sides of (6-61)
are equal to $\delta^2/n$ for all values of $m$, and the Cramér-Rao inequality becomes an
equality.
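
The comparison between the risk (6-31) and the Cramér-Rao bound for the biased estimator can be checked with a few lines of Python. This is a sketch with hypothetical function names and parameter values; `mu` and `beta2` are the prior mean and variance and `delta2` is the variance of one measurement.

```python
def cramer_rao_bound(beta2, delta2, n):
    """Right side of (6-61) for the biased estimator (6-10)."""
    noise = delta2 / n
    return noise * (beta2 / (beta2 + noise)) ** 2


def risk(m, mu, beta2, delta2, n):
    """Mean-square error (6-31) of the estimator (6-10) at true mean m."""
    noise = delta2 / n
    return noise * (beta2 ** 2 + (mu - m) ** 2 * noise) / (beta2 + noise) ** 2


# Hypothetical values: prior mean 0, prior variance 4, unit noise variance, n = 10.
bound = cramer_rao_bound(4.0, 1.0, 10)
risk_at_prior_mean = risk(0.0, 0.0, 4.0, 1.0, 10)
risk_elsewhere = risk(2.0, 0.0, 4.0, 1.0, 10)
```

The risk meets the bound only when the true mean coincides with the prior mean, and exceeds it by the squared bias otherwise.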

When for a particular estimator $\hat{\theta}(v)$ the two sides of the Cramér-Rao inequality
(6-61) are equal, the estimator is said to be efficient. The Schwarz inequality
(4-73) is an equality if and only if the random variables $A$ and $B$ are proportional,
$A = kB$, $k$ nonrandom. The condition for an efficient estimator is therefore that

\[ g_\theta(v \mid \theta) = \frac{\partial}{\partial\theta} [\ln p(v \mid \theta)] = k(\theta)[\hat{\theta}(v) - \theta], \tag{6-64} \]

in which $k(\theta)$ may depend on the estimandum $\theta$, but not on the data $v$.

An efficient estimator is unbiased, for upon taking the expected value of (6-64)
with respect to the distribution of the data, we find by (6-52) that

\[ E[\hat{\theta}(v) - \theta] = 0. \]

For this estimator, when we square (6-64) and average,

\[ E\{[g_\theta(v \mid \theta)]^2\} = [k(\theta)]^2 E\{[\hat{\theta}(v) - \theta]^2\}, \]

and since the two sides of (6-61) are then equal and its numerator is 1, the product
$E\{[g_\theta(v \mid \theta)]^2\}\, E\{[\hat{\theta}(v) - \theta]^2\}$ equals 1,
whence the mean-square error of the efficient estimator is

\[ E\{[\hat{\theta}(v) - \theta]^2\} = \frac{1}{k(\theta)}. \tag{6-65} \]

Integrating (6-64) with respect to $\theta$, we find

\[ \ln p(v \mid \theta) = \int^{\theta} k(\theta')[\hat{\theta}(v) - \theta']\, d\theta' + \ln r(v), \]

in which $\ln r(v)$, independent of $\theta$, serves as a constant of integration. Hence the
probability density function of the data must have the form

\[ p(v \mid \theta) = r(v)\, H(\hat{\theta}(v); \theta). \]

Any Bayes estimator of the parameter $\theta$ will therefore depend on the data $v$ only
through the function $\hat{\theta}(v)$, which is therefore a sufficient statistic for estimating the
parameter $\theta$. An efficient estimator is always sufficient, but the opposite may not be
true.

For the efficient estimator $\hat{m}(v) = \bar{X}$ of the mean $m$ of $n$ independent Gaussian
random variables, $k(\theta) = n/\delta^2$, as appears by comparing (6-63) and (6-64); and
(6-65) then confirms that its variance is $\delta^2/n$. Problem 6-2 directs you to show
that the maximum-likelihood estimator of the variance of $n$ independent Gaussian
random variables with expected value 0 is also efficient. On the whole, however,
efficient estimators are rare.

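The Cramér-Rao inequality can be expressed in terms of the input $v(t)$ to the
receiver by again dividing the joint probability density function $p(v \mid \theta)$ of the data
by the joint density function $p_0(v)$ of the data when noise alone is present, and it
becomes

\[ E\{(\hat{\theta}[v(t)] - \theta)^2\} \ge \frac{(d\bar{\theta}/d\theta)^2}{E\{(g_\theta[v(t) \mid \theta])^2\}}, \tag{6-66} \]

with $g_\theta[v(t) \mid \theta] = (\partial/\partial\theta) \ln \Lambda[v(t) \mid \theta]$ as before.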

We learned in Sec. 6.2.3 that when the signal-to-noise ratio is large, the
maximum-likelihood estimator is asymptotically unbiased, and the numerator of 
(6-66) becomes equal to 1. Comparing (6-66) and (6-56) we see that the right sides 
are then the same, and in this limit the Cramer-Rao inequality (6-66) becomes an 
equality. We say, therefore, that the maximum-likelihood estimator is asymptotically 
efficient. 

For a set of $m$ unbiased estimators $\hat{\theta}_i(v)$, $1 \le i \le m$, of the parameters
$\theta = (\theta_1, \theta_2, \ldots, \theta_m)$, there exists a matrix counterpart of the Cramér-Rao inequality.
The elements of the covariance matrix $B$ of the errors are

\[ B_{ij} = E\{[\hat{\theta}_i(v) - \theta_{0i}][\hat{\theta}_j(v) - \theta_{0j}]\} = \operatorname{Cov}\{\hat{\theta}_i(v), \hat{\theta}_j(v)\}. \]

Then

\[ B \ge \Gamma^{-1}, \tag{6-67} \]

where $\Gamma$ is the Fisher information matrix defined in (6-59). The matrix inequality
(6-67) means that the matrix $B - \Gamma^{-1}$ is nonnegative definite. That is, for any column
vector Y of coefficients

covariances of maximum-likelihood estimators of arrival time and carrier frequency
of a narrowband pulse signal is much simplified, as we shall see in Sec. 6.3.



6.2.6 The Ziv-Zakai Bound

For an unbiased estimator of the arrival time $\tau$ of a signal $s(t - \tau)$, the Cramér-Rao
inequality (6-66) becomes

\[ E[(\hat{\tau} - \tau)^2] \ge \frac{1}{d^2 \beta^2}, \tag{6-74} \]

where $d^2 = 2E/N$ is the signal-to-noise ratio and $\beta$ is the rms bandwidth defined
in (6-46) and (6-47). This inequality follows from the identity of the right sides of
(6-45), (6-55), and (6-66). For a rectangular signal of duration $T'$,

\[ s(t) = \begin{cases} A, & 0 \le t \le T', \\ 0, & t < 0,\ t > T', \end{cases} \tag{6-75} \]

however, $\beta^2 = \infty$, and (6-74) becomes the trivial inequality $E[(\hat{\tau} - \tau)^2] \ge 0$.

A more useful lower bound on the mean-square error of an estimate of signal
arrival time $\tau$ was discovered by Ziv and Zakai [Ziv69]. We shall briefly sketch their
derivation. Consider a communication system sending equally likely binary digits
0 and 1 every $T$ seconds by transmitting either the signal $s(t - \tau_1)$ (hypothesis $H_1$) or
the signal $s(t - \tau_2)$ (hypothesis $H_2$); $\tau_2 > \tau_1$. Each signal is assumed to arrive well
within the observation interval $(0, T)$. The optimum receiver, according to what
we learned in Chapter 2 (see Problem 2-7), passes its input $v(t)$ through a filter
matched to the signal $s(t - \tau_2) - s(t - \tau_1)$ and samples its output at time $t = T$.
If that output is positive, it chooses hypothesis $H_2$; if negative, hypothesis $H_1$. The
solution to Problem 2-8 informs us that the minimum attainable probability of error
is

\[ P_{e,\min} = \operatorname{erfc}\left( d \sqrt{(1 - \lambda)/2} \right), \qquad d^2 = \frac{2E}{N}, \tag{6-76} \]

where $E$ is the energy of each signal, $N$ the unilateral spectral density of the noise,
and

\[ \lambda = \lambda(\tau_2 - \tau_1) = \frac{1}{E} \int_0^T s(t - \tau_1)\, s(t - \tau_2)\, dt. \tag{6-77} \]

We neglect the possibility that either signal overlaps one end of the observation
interval or the other; $T \gg T'$.

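An alternative and inferior detector estimates the arrival time $\tau$ of the received
signal and chooses hypothesis $H_1$ if $\hat{\tau} < \tfrac{1}{2}(\tau_1 + \tau_2)$ and hypothesis $H_2$ if
$\hat{\tau} > \tfrac{1}{2}(\tau_1 + \tau_2)$. Its probability of error will be

\[ \tfrac{1}{2} \Pr[\hat{\tau} > \tfrac{1}{2}(\tau_1 + \tau_2) \mid H_1] + \tfrac{1}{2} \Pr[\hat{\tau} < \tfrac{1}{2}(\tau_1 + \tau_2) \mid H_2], \]

and this must be greater than $P_{e,\min}$. Therefore, with $\delta = \tau_2 - \tau_1 > 0$,

\[ \tfrac{1}{2} \Pr(\hat{\tau} - \tau_1 > \tfrac{1}{2}\delta \mid H_1) + \tfrac{1}{2} \Pr(\tau_2 - \hat{\tau} > \tfrac{1}{2}\delta \mid H_2) \ge P_{e,\min}. \]

Because

\[ \Pr(\hat{\tau} - \tau_1 > \tfrac{1}{2}\delta \mid H_1) \le \Pr(|\hat{\tau} - \tau_1| > \tfrac{1}{2}\delta \mid H_1), \]
\[ \Pr(\tau_2 - \hat{\tau} > \tfrac{1}{2}\delta \mid H_2) \le \Pr(|\hat{\tau} - \tau_2| > \tfrac{1}{2}\delta \mid H_2), \]

we can replace that inequality by

\[ \tfrac{1}{2} \Pr(|\hat{\tau} - \tau_1| > \tfrac{1}{2}\delta \mid H_1) + \tfrac{1}{2} \Pr(|\hat{\tau} - \tau_2| > \tfrac{1}{2}\delta \mid H_2) \ge P_{e,\min}. \tag{6-78} \]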

At this point we need the Chebyshev inequality, which for any random variable
$x$ having finite variance and for any $a$ and $g > 0$ states that

$$\Pr(|x - a| > g) \le \frac{E[(x-a)^2]}{g^2}. \tag{6-79}$$

To prove it, we write, in terms of the probability density function $p(\cdot)$ of $x$,

$$E[(x-a)^2] = \int_{-\infty}^{\infty} (x-a)^2 p(x)\,dx
\ge \int_{-\infty}^{a-g} (x-a)^2 p(x)\,dx + \int_{a+g}^{\infty} (x-a)^2 p(x)\,dx$$
$$\ge g^2\int_{-\infty}^{a-g} p(x)\,dx + g^2\int_{a+g}^{\infty} p(x)\,dx = g^2\Pr(|x-a| > g),$$

and we divide by $g^2$.

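Applying (6-79) to (6-78), putting $g = \tfrac12\delta$, and using (6-76), we find the
inequality

$$\tfrac12 E[(\hat\tau - \tau)^2 \mid \tau = \tau_1] + \tfrac12 E[(\hat\tau - \tau)^2 \mid \tau = \tau_2] \ge \tfrac14\delta^2 \operatorname{erfc}\!\left(d\sqrt{\tfrac12[1-\lambda(\delta)]}\right).$$

If we define $\varepsilon^2$ as the maximum value of $E[(\hat\tau - \tau)^2 \mid \tau]$ for any true value of the
arrival time in $0 < \tau < T$, then

$$\varepsilon^2 \ge \tfrac14\delta^2 \operatorname{erfc}\!\left(d\sqrt{\tfrac12[1-\lambda(\delta)]}\right). \tag{6-80}$$

The tightest lower bound is obtained by maximizing the right side as the separation
$\delta$ varies over $0 < \delta < T$.

If the signal vanishes outside an interval $(0, T')$, $\lambda(\delta) = 0$ for $\delta > T'$ by (6-77),
and the right side of (6-80) is largest for $\delta = T$, whence

$$\varepsilon^2 \ge \tfrac14 T^2 \operatorname{erfc}\!\left(\frac{d}{\sqrt2}\right). \tag{6-81}$$

In [Ziv69] this is called the external bound.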

Specializing now to a rectangular signal of duration $T'$ as in (6-75), we find
from (6-77) that

$$\lambda(\delta) = \begin{cases} 1 - \dfrac{|\delta|}{T'}, & -T' < \delta < T', \\ 0, & |\delta| > T', \end{cases}$$

and (6-80) becomes

$$\varepsilon^2 \ge \tfrac14\delta^2 \operatorname{erfc}\!\left(d\sqrt{\frac{\delta}{2T'}}\right) = \frac{T'^2}{d^4}\,b^4 \operatorname{erfc} b, \qquad 0 < \delta < T',$$

with $b = d\sqrt{\delta/2T'}$ the argument of $\operatorname{erfc}(\cdot)$. The right side is maximum for $b = 1.812$, and we
find

$$\varepsilon^2 \ge 0.3772\,\frac{T'^2}{d^4} \tag{6-82}$$

or, equivalently, $\varepsilon \ge 0.614\,T'/d^2$,
provided $d^2 = 2E/N > 2b^2 \approx 6.566$, so that the maximizing separation $\delta = 2b^2T'/d^2$ is less
than $T'$. This is called the internal bound. For large
signal-to-noise ratio $d^2$, this lower bound decreases more rapidly with increasing
signal-to-noise ratio than does the right side of the Cramér-Rao inequality (6-74)
for signals with finite bandwidth $\beta$. For $d^2 = 2E/N < 6.566$, one uses the external
bound (6-81), which depends on the duration $T$ of the observation interval.
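
The numbers quoted for the internal bound can be checked numerically. The sketch below is not from the text: it assumes that erfc here denotes the Gaussian tail integral $(2\pi)^{-1/2}\int_b^\infty e^{-t^2/2}\,dt$ (an assumption, but the one that reproduces the quoted values), and the function names are hypothetical.

```python
import math

def gaussian_tail(b):
    """Gaussian tail integral (2*pi)**-0.5 * int_b^inf exp(-t*t/2) dt.
    Assumption: this is the function written 'erfc' in the text."""
    return 0.5 * math.erfc(b / math.sqrt(2.0))

def internal_bound_factor(b):
    """The factor b**4 * erfc(b) that multiplies T'**2 / d**4."""
    return b ** 4 * gaussian_tail(b)

def maximize(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximum of a unimodal f on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, c = lo, hi
    x1 = c - g * (c - a)
    x2 = a + g * (c - a)
    while c - a > tol:
        if f(x1) < f(x2):      # maximum lies in [x1, c]
            a = x1
            x1 = x2
            x2 = a + g * (c - a)
        else:                  # maximum lies in [a, x2]
            c = x2
            x2 = x1
            x1 = c - g * (c - a)
    return 0.5 * (a + c)

b_opt = maximize(internal_bound_factor, 0.5, 4.0)
coefficient = internal_bound_factor(b_opt)   # numerical factor in (6-82)
d2_threshold = 2.0 * b_opt ** 2              # smallest d**2 with delta < T'
```

Under that assumption `b_opt`, `coefficient`, and `d2_threshold` should agree with the values 1.812, 0.3772, and 6.566 given above to the quoted precision.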

The change from one type of bound to the other when the signal-to-noise ratio 
d 2 passes a certain limit exemplifies what is called the threshold effect. At large 
signal-to-noise ratios the main source of error is the displacement by the noise of 
the highest peak of the output of the matched filter. When the signal-to-noise ratio 
is low, one of the many peaks of that output may surpass the height of the peak 
occurring near the time $\tau_0 + T'$ when the output would be maximum in the absence
of noise. The result is an error that may be a considerable fraction of the duration T 
of the observation interval. This is why the duration T appears in the bound (6-81). 

For signals of finite bandwidth $\beta$, [Ziv69] presents a lower bound of the same
form as in (6-74), except with a different numerical coefficient: 

J > M £ > 2.88. 

Refinements involving the use of other probability bounds than the Chebyshev, but
requiring additional assumptions about the probability density function of the
estimator $\hat\tau[v(t)]$, lead to similar forms for the lower bound on the mean-square
estimation error, but with larger multiplicative factors. Remember that (6-81) and (6-82)
are only lower bounds to the mean-square error $E[(\hat\tau - \tau)^2]$, and resist the
temptation to regard the right side of either as an approximation to the actual mean-square
error in an estimate of the arrival time $\tau$.



6.3 ESTIMATION OF PARAMETERS OF A 
NARROWBAND SIGNAL 

6.3.1 Arrival Time 

The maximum-likelihood estimation of the arrival time $\tau$ of a pulse signal $Af(t-\tau)$
was shown in Sec. 6.2.2 to require timing the peak value of the rectified output
$|v_0(t)|$ of a filter matched to the signal $f(t)$. The mean-square error of the resulting
estimate was given by (6-45) as

$$\operatorname{Var}\hat\tau \approx \frac{1}{d^2\beta^2}$$

in the limit of large signal-to-noise ratio $d^2 = 2E/N$, with $\beta$ the rms bandwidth of
the signal as defined in (6-46) and (6-47).

In radar the transmitted and received signals are narrowband modulations of
a high-frequency carrier, and the delayed echo has the form

$$s(t; \tau) = A\,\mathrm{Re}[F(t-\tau)\exp i\Omega(t-\tau)],$$

in which $F(t)$ is the complex envelope. For a signal of this kind, as can be seen from
(6-47), the bandwidth $\beta$ is roughly equal to the carrier frequency $\Omega$, which is much
larger than the bandwidth $\Delta\omega$ of the envelope $F(t)$. Then (6-45) becomes

$$\operatorname{Var}\hat\tau \approx \frac{1}{d^2\Omega^2}, \tag{6-83}$$

which asserts that when the signal-to-noise ratio $d^2$ is large, the arrival time $\tau$ of the
radar echo can be measured within a fraction of a period of its carrier. Because the
range $r$ of the target equals $\tfrac12 c\tau$, where $c$ is the velocity of electromagnetic radiation,
this implies that

$$\operatorname{Var}\hat r = \tfrac14 c^2 \operatorname{Var}\hat\tau \approx \frac{\lambda^2}{16\pi^2 d^2},$$

where $\lambda = 2\pi c/\Omega$ is the wavelength of the radiation; and when $d^2 \gg 1$, the range
can be estimated within a fraction of that wavelength.
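
To give a feeling for the magnitudes involved, the following sketch evaluates the rms delay error implied by (6-83) and the corresponding rms range error, using the round-trip relation $r = \tfrac12 c\tau$. The carrier frequency and the value of $d$ are illustrative assumptions, and the function names are hypothetical.

```python
import math

C_LIGHT = 299_792_458.0  # velocity of electromagnetic radiation, m/s

def rms_delay_error(carrier_hz, d):
    """sqrt(Var tau) = 1 / (d * Omega) from (6-83), with Omega = 2*pi*carrier_hz."""
    return 1.0 / (d * 2.0 * math.pi * carrier_hz)

def rms_range_error(wavelength_m, d):
    """sqrt(Var r) = wavelength / (4*pi*d), the square root of lambda^2/(16 pi^2 d^2)."""
    return wavelength_m / (4.0 * math.pi * d)

carrier_hz = 3.0e9      # assumed 3000-MHz carrier
d = 100.0               # assumed signal-to-noise parameter (d**2 = 2E/N = 1e4)
wavelength = C_LIGHT / carrier_hz
sigma_tau = rms_delay_error(carrier_hz, d)
sigma_r = rms_range_error(wavelength, d)
```

With these assumed values the wavelength is about 10 cm and `sigma_r` is a small fraction of it, which is the carrier-level accuracy that the ambiguity among cycles usually prevents a radar from attaining.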

To attain such accuracy the oscillations of frequency $2\Omega$ in the squared output
$[v_0(t)]^2$ of the matched filter must be observed closely in order to ascertain which is
highest. Only when the signal is very strong will this be possible without ambiguity.
With weaker signals, even though the output stands well above the noise level, the
noise may give a neighboring cycle of those oscillations a greater excursion from
the zero line, and the error in the estimate will be some multiple of their period
$\pi/\Omega$. If the measurement is repeated several times, the observed maximum will
jump erratically from one cycle to another in the vicinity of the time $\tau + T'$.

When this is happening, knowledge of the phase of the transmitted pulse is 
no longer of any use; and even if that phase were uncontrolled, the estimate of the 
arrival time would not be much altered. This estimate and its accuracy will be nearly 
the same as for the time of arrival of a quasiharmonic signal 

$$s(t; A, \psi, \tau) = A\,\mathrm{Re}\,F(t-\tau)\,e^{i\Omega t + i\psi}$$

of unknown phase $\psi$.

Because the phase of the signal is unknown, the receiver may as well rectify
the output of a filter matched to the signal

$$\mathrm{Re}\,F(t)\,e^{i\Omega t}$$

and filter off the components of frequency $2\Omega$ in the output of the rectifier, to
produce

$$R(t) = \left|\int_0^{T'} F^*(T'-s)\,V(t-s)\,ds\right|^2,$$

where $V(t)$ is the complex envelope of the input $\mathrm{Re}\,V(t)\exp i\Omega t$,

$$V(t) = A F(t-\tau)\,e^{i\psi} + N(t);$$

$T'$ is as before the delay in the matched filter, and $N(t)$ is the complex envelope of
the noise.

Figure 6-1. Estimation of target range. (a) Complex envelope of signal arriving
at time $\tau$. (b) Complex impulse response of matched filter. (c) Signal component
$R_s(t)$ of rectified output of matched filter.

We shall see in the next part that timing the peak value of the function $R(t)$
indeed yields the maximum-likelihood estimate of the arrival time $\tau$ of the envelope
of this signal. Figure 6-1 illustrates the complex envelope of the delayed signal pulse,
the complex impulse response of the matched filter, and the signal component $R_s(t)$
of the output $R(t)$ of the rectifier, that is, the form that output would take, were no
noise present. The noise displaces the peak of the rectified output $R(t)$ from that of
$R_s(t)$ and thus introduces a random error into the estimate of the arrival time $\tau$.

It is plausible that, as will be demonstrated in Sec. 6.3.4, in the limit of large
signal-to-noise ratio the variance of the maximum-likelihood estimator of the arrival
time $\tau$ is expressed as in (6-45), except that the bandwidth $\beta$ must be replaced by the
rms bandwidth $\Delta\omega$ of the complex envelope of the signal,

$$\operatorname{Var}\hat\tau \approx \frac{1}{d^2\,\Delta\omega^2}. \tag{6-84}$$

The mean-square frequency deviation $\Delta\omega^2$ is defined by

$$\Delta\omega^2 = \frac{\int_{-\infty}^{\infty} (\omega - \bar\omega)^2 |f(\omega)|^2\,d\omega/2\pi}{\int_{-\infty}^{\infty} |f(\omega)|^2\,d\omega/2\pi}, \tag{6-85}$$

with $f(\omega)$ the Fourier transform of the complex envelope $F(t)$ of the signal, as in
(3-7), and $\bar\omega$ the mean deviation from the carrier frequency,

$$\bar\omega = \frac{\int_{-\infty}^{\infty} \omega\,|f(\omega)|^2\,d\omega/2\pi}{\int_{-\infty}^{\infty} |f(\omega)|^2\,d\omega/2\pi}. \tag{6-86}$$

Swerling [Swe59], [Swe64] showed that the arrival time can be determined with
a variance given by (6-83) when the signal-to-noise ratio is so large that $d \gg \Omega/\Delta\omega$.
When $1 \ll d \ll \Omega/\Delta\omega$, however, the variance is limited by $1/(d^2\Delta\omega^2)$ as in (6-84). It
is not so hard to see why the dividing line between the ranges of validity of the two
formulas should occur at a signal-to-noise ratio on the order of $\Omega/\Delta\omega$. In order to
measure the time $\tau$ within a fraction of a period of the carrier, it must be possible
to locate the peak of the envelope of the output of the matched filter with an error
somewhat smaller than $\pi/\Omega$. Hence the signal-to-noise ratio must, by (6-84), be
large enough that

$$(\operatorname{Var}\hat\tau)^{1/2} \approx \frac{1}{d\,\Delta\omega} \ll \frac{\pi}{\Omega},$$

from which we obtain the condition $d \gg \Omega/(\pi\,\Delta\omega)$.

The system just derived closely resembles a conventional radar range-measuring
device. The intermediate-frequency amplifier of the radar corresponds to the
matched filter, although it may not be precisely matched to the signal, but may
merely have an approximately equal bandwidth. The target range is measured by
timing the peak of the rectifier output as displayed on an A-scope. This output
has many peaks due to the noise, but whenever the signal-to-noise ratio suffices for
practically certain detection, one peak caused by the signal stands out above the
rest. The noise displaces the highest point of this peak by an amount whose
mean-square magnitude is given approximately by the variance $\operatorname{Var}\hat\tau$ in (6-84). The greater
the bandwidth $\Delta\omega$ of the signal, the more accurately the time of arrival $\tau$ can be
estimated.

6.3.2 Signal Arrival Time and Carrier Frequency 

If the radar target is not stationary, as assumed in Sec. 6.3.1, but is moving toward
or away from the antenna, the carrier frequency $\Omega$ of the echo differs from that
of the transmitted pulse because of the Doppler effect. If we pick the transmitted
frequency as our reference frequency $\Omega_r$, the echo will have the form

$$s(t; \tau, w) = A\,\mathrm{Re}\,F(t-\tau)\exp[i(w + \Omega_r)(t-\tau) + i\psi], \tag{6-87}$$

where $A$ is its amplitude, $\psi$ its phase, $\tau$ its epoch, and $w = \Omega - \Omega_r$ the change in its
carrier frequency. This Doppler shift $w$ is given by

$$w = \Omega - \Omega_r = \frac{2v}{c}\,\Omega_r, \tag{6-88}$$

where $v$ is the component of target velocity in the direction of the radar antenna and
$c$ is the velocity of electromagnetic radiation. A derivation of this formula, which
holds when $v \ll c$, can be found in Appendix F.

The pulse envelope $F(t)$ is also compressed by a factor $(1 + 2v/c)^{-1}$, but for
targets of ordinary velocities this factor differs negligibly from 1. For a target moving
at a rate of 500 mph and for a carrier frequency $\Omega_r = 2\pi \cdot 3 \cdot 10^9$ rad/sec (3000
MHz), the frequency shift is $w = 2\pi \cdot 4.5 \cdot 10^3$ rad/sec (4500 Hz), an appreciable
fraction of the 1-MHz bandwidth typical of a radar pulse. For the much larger
velocities encountered in tracking missiles and satellites, the Doppler shift will be
even greater.
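
The arithmetic behind that example follows directly from (6-88). The short sketch below repeats it; the conversion factor and the function name are introduced here only for illustration.

```python
MPH_TO_MPS = 0.44704         # metres per second in one mile per hour
C_LIGHT = 299_792_458.0      # velocity of electromagnetic radiation, m/s

def doppler_shift_hz(radial_speed_mps, carrier_hz):
    """Doppler shift w / (2*pi) = (2 v / c) * carrier frequency, valid for v << c."""
    return 2.0 * radial_speed_mps / C_LIGHT * carrier_hz

# Target closing at 500 mph, 3000-MHz carrier, as in the text.
shift_hz = doppler_shift_hz(500.0 * MPH_TO_MPS, 3.0e9)
```

The result is close to 4.5 kHz, the rounded figure quoted above.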

When the echo signal suffers a Doppler shift comparable with its bandwidth, 
the response to it of a filter matched to the transmitted pulse is much reduced, and 
the signal may be missed in the noise. Only if the target velocity is known can 
the receiver filter be properly matched to the echo pulse. If the Doppler shift can 
be measured, on the other hand, the observer can calculate the component of the 
target velocity in the direction of the radar antenna, obtaining valuable information 
for tracking the target efficiently. The possibility thus arises of measuring both the 
distance of the target and its velocity by estimating the time t of arrival of the echo 
signal and the frequency shift w of its carrier. 

As the time $\tau$ of arrival is unknown, we can combine the term $-\Omega_r\tau$ in (6-87)
with the unknown phase $\psi$ and write the signal instead as

$$s(t; A, \psi, \tau, w) = A\,\mathrm{Re}\,F(t; \tau, w)\exp[i\Omega_r t + i\psi]$$

with

$$F(t; \tau, w) = F(t-\tau)\,e^{iw(t-\tau)}. \tag{6-89}$$

This signal is received in the presence of white, Gaussian noise of unilateral spec- 
tral density N, during an observation interval (0, T) that is so long that signals 
overlapping its ends can be disregarded. 

Let us designate by $\theta'$ those parameters other than the amplitude $A$ and the
phase $\psi$; in this problem $\theta' = (\tau, w)$. About the amplitude $A$ we have no prior
information, and the phase $\psi$ can be assumed uniformly distributed over $(0, 2\pi)$.
Equivalently, with

$$A\,e^{i\psi} = u + iv,$$

the components of the complex amplitude $u + iv$ can be taken as independently
random. Although we have no interest in their values, we estimate them anyhow,
avoiding the necessity of postulating a prior density function for them.
For the complex envelope of the signal we put

$$S(t; \theta) = (u + iv)\,S(t; \theta'), \qquad \theta' = (\tau, w).$$

The logarithm of the likelihood functional for detecting this signal in white noise is

$$\mathrm{Re}\left[\frac{u - iv}{N}\int_0^T S^*(t; \theta')\,V(t)\,dt\right] - \frac{u^2 + v^2}{2N}\int_0^T |S(t; \theta')|^2\,dt \tag{6-90}$$

by (3-54) with $Q(t; \theta) = N^{-1}S(t; \theta)$ from (3-52) and (3-45). The maximum-
likelihood estimates of the parameters $u$, $v$, and $\theta' = (\tau, w)$ are those values for
which (6-90) is maximum. In terms of the random variables $x$ and $y$ defined by

$$z = x + iy = \frac{1}{N}\int_0^T S^*(t; \theta')\,V(t)\,dt$$

and the reduced ambiguity function

$$H'(\theta'_1, \theta'_2) = \frac{1}{N}\,\mathrm{Re}\int_0^T S^*(t; \theta'_1)\,S(t; \theta'_2)\,dt,$$

we can write (6-90) as

$$\ln\Lambda[v(t) \mid \theta] = ux + vy - \tfrac12(u^2 + v^2)\,H'(\theta', \theta'). \tag{6-91}$$

Differentiating first with respect to $u$ and $v$ and setting the results equal to zero,
we find for the maximum-likelihood estimates $\hat u$ and $\hat v$ of these components of the
complex amplitude

$$\hat u + i\hat v = \hat A\,e^{i\hat\psi} = \frac{z}{H'(\hat\theta', \hat\theta')} = \frac{1}{N H'(\hat\theta', \hat\theta')}\int_0^T S^*(t; \hat\theta')\,V(t)\,dt, \tag{6-92}$$

into which we must substitute the maximum-likelihood estimates of the parameters
$\theta'$ yet to be determined. Putting (6-92) into (6-91), we see that these are the values
of $\theta'$ for which

$$\max_{u,v}\,\ln\Lambda[v(t) \mid \theta] = \frac{1}{2N^2 H'(\theta', \theta')}\left|\int_0^T S^*(t; \theta')\,V(t)\,dt\right|^2$$

is maximum.

In principle the maximum-likelihood estimates $\hat\theta'$ could be obtained by building
a bank of parallel filters matched to signals of the form $\mathrm{Re}[F(t; \theta')\exp i\Omega_r t]$ for
closely spaced values of $\theta'$ in the space $\Theta'$ of these remaining parameters. The
output of each filter must be passed to a quadratic rectifier, whose output at the
end of the interval $(0, T)$ is divided by $2N^2H'(\theta', \theta')$. The parameters of the filter
yielding the largest value of the resulting quantity identify the maximum-likelihood
estimates of $\theta'$.
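
A discrete-time sketch of this search over a grid of parameter values may make the prescription concrete. It is only an illustration under stated assumptions: the Gaussian envelope, the sampling step, the grids, and all names are hypothetical; no noise is added, so the maximum falls exactly on the true parameters; and the division by $H'(\theta', \theta')$ is omitted because, for a signal lying well inside the observation interval, that quantity does not depend on $(\tau, w)$.

```python
import cmath
import math

def envelope(t, a=1.0):
    """Hypothetical unit-energy Gaussian complex envelope F(t)."""
    return (2.0 * a / math.pi) ** 0.25 * math.exp(-a * t * t)

def received(t, amp, phase, tau0, w0):
    """Noise-free complex envelope A e^{i psi} F(t - tau0) e^{i w0 (t - tau0)}."""
    return amp * cmath.exp(1j * phase) * envelope(t - tau0) * cmath.exp(1j * w0 * (t - tau0))

def statistic(samples, times, dt, tau, w):
    """Riemann-sum approximation to |integral S*(t; tau, w) V(t) dt|^2."""
    acc = 0j
    for t, v in zip(times, samples):
        s = envelope(t - tau) * cmath.exp(1j * w * (t - tau))
        acc += s.conjugate() * v * dt
    return abs(acc) ** 2

dt = 0.05
times = [k * dt for k in range(-200, 201)]        # observation interval
tau0, w0 = 0.6, -1.5                              # true arrival time and shift
samples = [received(t, 2.0, 0.7, tau0, w0) for t in times]

taus = [round(-2.0 + 0.2 * k, 10) for k in range(21)]   # grid of arrival times
ws = [round(-3.0 + 0.3 * k, 10) for k in range(21)]     # grid of Doppler shifts
best = max((statistic(samples, times, dt, tau, w), tau, w) for tau in taus for w in ws)
tau_hat, w_hat = best[1], best[2]
```

Each pair `(tau, w)` plays the role of one filter sampled at one instant; with noise present the same maximization would be applied to the noisy samples, and the estimates would scatter about the true values.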

When as here $\theta' = (\tau, w)$, and the observation interval $(0, T)$ is so much longer
than the duration $T'$ of the signal that we can disregard the possibility that the signal
overlaps one end or the other of it, the quantity

$$H'(\theta', \theta') = \frac{1}{N}\int_{-\infty}^{\infty} |F(t-\tau)|^2\,dt$$

is independent of $\tau$ and $w$ and can be dropped. One then constructs a bank of
parallel filters, each matched to a signal

$$\mathrm{Re}[F(t)\exp i(\Omega_r + w)t]$$

for one of a closely spaced set of values of the Doppler shift $w$ spanning the range
of expected Doppler shifts. The complex impulse response of one of these filters can
be taken as

$$K_w(s) = \begin{cases} F^*(T'-s)\,e^{-iw(T'-s)}, & 0 \le s \le T', \\ 0, & s < 0,\ s > T', \end{cases} \tag{6-93}$$

where the interval $(0, T')$ is long enough to contain the entire signal.

The signal component of the output of one of these filters, when the signal

$$s(t; \tau_0, w_0) = A\,\mathrm{Re}\{F(t-\tau_0)\exp[i(\Omega_r + w_0)t + i\psi]\}$$

arrives, will have the complex envelope

$$S_0(t) = A\int_0^{T'} K_w(s)\,F(t-\tau_0-s)\exp[iw_0(t-s) + i\psi]\,ds \tag{6-94}$$
$$\approx A\exp(iw_0 t + i\psi)\int_{-\infty}^{\infty} F^*(u)\,F(t-T'-\tau_0+u)\exp[-i(w-w_0)u]\,du,$$

with an inconsequential phase. The output $\mathrm{Re}\,V_0(t; w)\exp i\Omega_r t$ of this filter is
quadratically rectified, and the signal component of the output of the rectifier is

$$R_s(t, w) = A^2\left|\int_{-\infty}^{\infty} F^*(u)\,F(t-T'-\tau_0+u)\exp[-i(w-w_0)u]\,du\right|^2. \tag{6-95}$$

This signal component reaches its largest peak value in the output of that filter
tuned to a signal with Doppler shift $w = w_0$, and that peak value occurs at time
$t_0 = \tau_0 + T'$. The noise may cause the largest peak value to appear at the output
of a filter tuned for a different value of $w$ and at a different time, introducing errors
into the resulting estimates $\hat\tau$ and $\hat w$ of the time of arrival and the Doppler shift.

If the spacing $\delta w$ between the values of Doppler shift $w$ for which the matched
filters in the bank are tuned is much less than the reciprocal $T'^{-1}$ of the duration
of an echo, the noise components of the outputs of adjacent filters will be highly
correlated. It will then be possible to estimate $\tau$ and $w$ by interpolating among the
functions $|V_0(t; w)|^2$ for values of $w$ in the neighborhood of the largest peak output.
When the signal-to-noise ratio is large, the rms error in the estimate $\hat w$ of the Doppler
shift can be made rather smaller than the separation $\delta w$ of the pass frequencies of
the matched filters, and that rms error will be roughly the same as though one
had a continuum of matched filters tuned over the entire range of expected carrier
frequencies $\Omega_r + w$.

The simplest way to implement this prescription is to construct many copies
of a filter matched to the same quasiharmonic signal $\mathrm{Re}\,F(t)\exp i\Omega_I t$, where $\Omega_I$ is
a suitable intermediate frequency. Preceding each such filter is a mixer in which the
input $\mathrm{Re}\,V(t)\exp i\Omega_r t$ is beaten against a wave of angular frequency $\Omega_r + w - \Omega_I$
for one of a uniformly spaced set of values of $w$. Those waves are generated by local
oscillators whose output angular frequencies are displaced from $\Omega_r - \Omega_I$ by signals
from a frequency multiplier that produces sinusoids of angular frequencies that are
all required multiples of the spacing $\delta w$.

If the Doppler shift $w_0$ is known, only a single matched filter is required, and
we can take $w = w_0$ in (6-94) and (6-95). The prescription given in Sec. 6.3.1 for the
maximum-likelihood estimator of the arrival time $\tau$ then ensues.

6.3.3 The Complex Ambiguity Function



When in Sec. 6.3.4 we calculate the variances and covariance of estimates of the
arrival time $\tau$ and the Doppler shift $w$ of a narrowband signal and when we interpret
the results, we shall need to consider a complex ambiguity function $\lambda(\tau, w)$ defined
by

$$\lambda(\tau, w) = \int_{-\infty}^{\infty} F(s - \tfrac12\tau)\,F^*(s + \tfrac12\tau)\,e^{-iws}\,ds, \tag{6-96}$$

in which we assume the complex envelope so normalized that

$$\int_{-\infty}^{\infty} |F(t)|^2\,dt = 1, \tag{6-97}$$

whereupon $\lambda(0, 0) = 1$. In terms of the Fourier transform $f(\omega)$ of the complex
envelope $F(t)$ (see (3-7)) the ambiguity function can be written

$$\lambda(\tau, w) = \int_{-\infty}^{\infty} f(\omega + \tfrac12 w)\,f^*(\omega - \tfrac12 w)\,e^{-i\omega\tau}\,\frac{d\omega}{2\pi}. \tag{6-98}$$

This function will also figure in our analysis of signal resolution in Chapter 10. In
order not to digress in the midst of Sec. 6.3.4, we shall now describe some of its
properties.

In terms of the ambiguity function, the signal component $R_s(t; w)$ of the
rectified output of a filter matched to a signal with frequency shift $w$ is, by (6-95),

$$R_s(t; w) = A^2\,|\lambda(t - t_0, w - w_0)|^2, \qquad t_0 = \tau_0 + T'. \tag{6-99}$$

Think now of a bank of filters matched for signals (6-89) with densely spaced carrier
frequencies $\Omega_r + w$, and imagine plotting these rectified outputs $R_s(t; w)$ vertically
over the $(t, w)$-plane. The resulting surface will reproduce the absolute square of the
ambiguity function, and its peak will lie over the point $(t_0, w_0)$.

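Expanding the ambiguity function $\lambda(\tau, w)$ in the neighborhood of the origin,
we obtain after some labor

$$\lambda(\tau, w) = 1 - i\bar\omega\tau - i\bar t w - \tfrac12\overline{\omega^2}\,\tau^2 - \overline{\omega t}\,\tau w - \tfrac12\overline{t^2}\,w^2 + \cdots. \tag{6-100}$$

In this series appear various moments and cross-moments of the complex envelope
$F(t)$ and its Fourier transform $f(\omega)$,

$$\overline{t^n} = \int_{-\infty}^{\infty} t^n\,|F(t)|^2\,dt, \tag{6-101}$$

$$\overline{\omega^n} = \int_{-\infty}^{\infty} \omega^n\,|f(\omega)|^2\,\frac{d\omega}{2\pi}; \tag{6-102}$$

there are no denominators because of the normalization (6-97). The quantity $\overline{\omega t}$ is
defined by

$$\overline{\omega t} = -\tfrac12 i - i\int_{-\infty}^{\infty} t\,F^*(t)\,F'(t)\,dt, \tag{6-103}$$

the prime indicating differentiation. The term $-\tfrac12 i$ makes this quantity real, as can
be shown by integration by parts: the real part of the integral equals
$\tfrac12\int t\,(|F|^2)'\,dt = -\tfrac12$ under the normalization (6-97).

Without loss of generality we can choose the origin of time so that $\bar t = 0$ and
the carrier frequency so that $\bar\omega = 0$. Then $\overline{\omega^2}$ reduces to the mean-square frequency
deviation $\Delta\omega^2$ defined in (6-85) and $\overline{t^2}$ to the mean-square duration

$$\Delta t^2 = \int_{-\infty}^{\infty} (t - \bar t\,)^2\,|F(t)|^2\,dt \tag{6-104}$$

of the complex envelope $F(t)$.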

If we express the complex envelope as in (3-5),

$$F(t) = M(t)\,e^{i\phi(t)},$$

in terms of the amplitude and phase modulations defined in (3-2), these quantities
become

$$\Delta t^2 = \int_{-\infty}^{\infty} t^2\,[M(t)]^2\,dt, \tag{6-105}$$

$$\Delta\omega^2 = \int_{-\infty}^{\infty} [M'(t)]^2\,dt + \int_{-\infty}^{\infty} [\phi'(t)]^2\,[M(t)]^2\,dt, \tag{6-106}$$

$$\Delta(\omega t) = \int_{-\infty}^{\infty} t\,\phi'(t)\,[M(t)]^2\,dt, \tag{6-107}$$

primes indicating differentiation with respect to time. Here we have assumed the
origins of time and frequency so chosen that $\bar t = 0$ and $\bar\omega = 0$.

The quantity $\Delta(\omega t)$ is thus a weighted average of the product of the time $t$ and
the instantaneous frequency modulation $\phi'(t)$; it vanishes for a purely amplitude-
modulated signal. The mean-square frequency deviation $\Delta\omega^2$ is composed of a
term representing the time dependence of the amplitude modulation and a term
corresponding to a weighted squared frequency modulation. The weighting function
is always the squared amplitude modulation $[M(t)]^2$.

The outputs of the bank of matched filters introduced at the beginning of this
part are correlated. The complex envelopes of the noise components of the outputs
of those filters are

$$V_0(t; w) = \int_{-\infty}^{\infty} K_w(s)\,N(t-s)\,ds,$$

where $N(t)$ is the complex envelope of the noise. We define their complex cross-
covariance function as

$$\phi(t_2 - t_1; w_1, w_2) = \tfrac12 E[V_0(t_1; w_1)\,V_0^*(t_2; w_2) \mid H_0],$$

and we want to express it in terms of the ambiguity function in (6-96).
Using the assumption that the noise is white,

$$\tfrac12 E[N(t_1)\,N^*(t_2)] = N\,\delta(t_1 - t_2),$$

we write it as

$$\phi(t_2 - t_1; w_1, w_2) = \tfrac12 E\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} K_{w_1}(s_1)\,K^*_{w_2}(s_2)\,N(t_1 - s_1)\,N^*(t_2 - s_2)\,ds_1\,ds_2$$
$$= N\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} K_{w_1}(s_1)\,K^*_{w_2}(s_2)\,\delta(t_1 - s_1 - t_2 + s_2)\,ds_1\,ds_2$$
$$= N\int_{-\infty}^{\infty} K_{w_1}(s_2 - t_2 + t_1)\,K^*_{w_2}(s_2)\,ds_2.$$

Putting $s = t_2 - t_1$, we find by (6-93) that

$$\phi(s; w_1, w_2) = N\int_{-\infty}^{\infty} F^*(T' - u + s)\exp[-iw_1(T' - u + s)]\,F(T' - u)\exp[iw_2(T' - u)]\,du$$
$$= N\int_{-\infty}^{\infty} F(T' - u)\,F^*(T' - u + s)\exp[-i(w_1 - w_2)(T' - u) - iw_1 s]\,du.$$

After comparing this with (6-96) and appropriately changing the integration variable,
we find that it can be written as

$$\phi(s; w_1, w_2) = N\exp[-\tfrac12 i(w_1 + w_2)s]\,\lambda(s, w_1 - w_2). \tag{6-108}$$

Thus the complex cross-covariance function of the complex envelopes of the outputs
of the filters tuned for frequency shifts $w_1$ and $w_2$ is proportional to the ambiguity
function $\lambda(s, w_1 - w_2)$, where $s = t_2 - t_1$ is the interval between samplings. The
dependence of this ambiguity function on the shape of the complex envelope $F(t)$ of
the signal will be studied in some detail in Chapter 10, where we treat the resolution
of close signals.

From (6-100) the squared ambiguity function $|\lambda(\tau, w)|^2$ in the neighborhood
of the origin is

$$|\lambda(\tau, w)|^2 \approx 1 - [\Delta\omega^2\,\tau^2 + 2\Delta(\omega t)\,\tau w + \Delta t^2\,w^2],$$
$$\Delta(\omega t) = \overline{\omega t} - \bar\omega\,\bar t,$$

through terms of second order. The width of the ambiguity function $|\lambda(\tau, w)|$ in
the $w$-direction is therefore on the order of $(\Delta t^2)^{-1/2}$, and (6-108) shows that the
correlation of the outputs of the matched filters in our filter bank extends over a
range of frequencies on the order of the reciprocal of the duration of the signal.
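
As a check on these properties, the ambiguity function (6-96) can be evaluated numerically for a simple envelope and compared with the second-order expansion. The sketch below assumes a hypothetical Gaussian envelope $F(t) = (2a/\pi)^{1/4}e^{-at^2}$, for which $\Delta\omega^2 = a$, $\Delta t^2 = 1/4a$, $\Delta(\omega t) = 0$, and the integral has the closed form $\exp(-\tfrac12 a\tau^2 - w^2/8a)$; the parameter value and the names are illustrative only.

```python
import cmath
import math

A_PARAM = 2.0  # hypothetical Gaussian parameter a

def envelope(t):
    """Unit-energy real Gaussian envelope, so F* = F."""
    return (2.0 * A_PARAM / math.pi) ** 0.25 * math.exp(-A_PARAM * t * t)

def ambiguity(tau, w, half_width=8.0, n=4000):
    """Riemann-sum evaluation of (6-96) over (-half_width, half_width)."""
    ds = 2.0 * half_width / n
    total = 0j
    for k in range(n + 1):
        s = -half_width + k * ds
        total += envelope(s - 0.5 * tau) * envelope(s + 0.5 * tau) * cmath.exp(-1j * w * s) * ds
    return total

def closed_form(tau, w):
    """Closed form of the ambiguity function for the Gaussian envelope."""
    return math.exp(-0.5 * A_PARAM * tau * tau - w * w / (8.0 * A_PARAM))

def quadratic_approx_sq(tau, w):
    """1 - [dw2 tau^2 + 2 dwt tau w + dt2 w^2] with dw2 = a, dt2 = 1/(4a), dwt = 0."""
    return 1.0 - (A_PARAM * tau * tau + w * w / (4.0 * A_PARAM))

peak = ambiguity(0.0, 0.0)
near_origin_exact = abs(ambiguity(0.05, 0.1)) ** 2
near_origin_approx = quadratic_approx_sq(0.05, 0.1)
```

Close to the origin the two values agree to within terms of fourth order, and the width of $|\lambda(0, w)|$ in $w$ scales as $(\Delta t^2)^{-1/2} = 2\sqrt{a}$ for this envelope.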

6.3.4 Calculation of the Error Covariances 

The covariance matrix of the estimators of the amplitude components $u$ and $v$, the
arrival time $\tau$, and the Doppler shift $w$ is approximately given by (6-58), which
involves the inverse of the Fisher information matrix $\Gamma$; the input signal-to-noise
ratio must be large. For narrowband signals received in white noise the elements of
that matrix are in turn given by (6-71) in terms of the generalized ambiguity function
$H(\theta_1, \theta_2)$ of (6-73). There $\theta = (u, v, \tau, w)$,

$$S(t; \theta) = (u + iv)\,F(t-\tau)\exp iw(t-\tau),$$

and

$$H(\theta_1, \theta_2) = N^{-1}\,\mathrm{Re}\Big[(u_1 - iv_1)(u_2 + iv_2)\int_0^T F^*(t-\tau_1)\,F(t-\tau_2)\exp[iw_2(t-\tau_2) - iw_1(t-\tau_1)]\,dt\Big].$$

Again invoking the long duration of the observation interval, we can write this as

$$H(\theta_1, \theta_2) = \frac{1}{N}\,\mathrm{Re}[(u_1 - iv_1)(u_2 + iv_2)\,e^{-iW\tau}\,\lambda(\tau, w)] \tag{6-110}$$

in terms of the complex ambiguity function defined in (6-96). Here

$$\tau = \tau_2 - \tau_1, \qquad w = w_1 - w_2, \qquad W = \tfrac12(w_1 + w_2). \tag{6-111}$$

In order to find the Fisher information matrix we must differentiate the function
in (6-110) with respect to the components of $\theta_1$ and $\theta_2$ separately and then set $\theta_1 =
\theta_2 = \theta_0$. When we do so, $\tau$ and $w$ vanish, and $W$ becomes the true Doppler shift
$w_0$. We normalize the complex envelope $F(t)$ as in (6-97), whereupon $\lambda(0, 0) = 1$,
and

$$d^2 = \frac{2E}{N} = \frac{u_0^2 + v_0^2}{N} = \frac{A_0^2}{N}$$

is the input signal-to-noise ratio, with $u_0 + iv_0 = A_0 e^{i\psi_0}$ the true complex amplitude.
The rows and columns of the matrix $\Gamma$ will be
labeled $u$, $v$, $\tau$, and $w$, in that order.

The differentiations with respect to the elements of the complex amplitude
$u + iv$ yield immediately

$$\Gamma_{uu} = \Gamma_{vv} = \frac{1}{N}, \qquad \Gamma_{uv} = 0.$$

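We can carry out those with respect to the other parameters most easily by using the
expansion (6-100) in the neighborhood of the origin. Substituting it into (6-110), we
obtain

$$H(\theta_1, \theta_2) = N^{-1}\,\mathrm{Re}[(u_1 - iv_1)(u_2 + iv_2)\,e^{-iW\tau}] \cdot [1 - \tfrac12\Delta\omega^2\,\tau^2 - \Delta(\omega t)\,w\tau - \tfrac12\Delta t^2\,w^2 + \cdots].$$

Now carrying out the differentiations, using (6-111), and at the end setting

$$u_1 + iv_1 = u_2 + iv_2 = u_0 + iv_0, \qquad u_0^2 + v_0^2 = A_0^2,$$
$$\tau_1 = \tau_2 = \tau_0, \qquad w_1 = w_2 = w_0,$$

we find after some labor that the Fisher information matrix is

$$\Gamma = d^2\begin{bmatrix}
A_0^{-2} & 0 & A_0^{-2}w_0v_0 & 0 \\
0 & A_0^{-2} & -A_0^{-2}w_0u_0 & 0 \\
A_0^{-2}w_0v_0 & -A_0^{-2}w_0u_0 & \Delta\omega^2 + w_0^2 & -\Delta(\omega t) \\
0 & 0 & -\Delta(\omega t) & \Delta t^2
\end{bmatrix}. \tag{6-112}$$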
Our concern here is the $2 \times 2$ covariance matrix of the estimators $\hat\tau[v(t)]$ of the
arrival time and $\hat w[v(t)]$ of the frequency shift. When we invert the Fisher matrix $\Gamma$,
we find that the $u$ and $v$ columns contain terms proportional to the true frequency
shift $w_0$, but the submatrix containing the variances and the covariance of $\hat\tau$ and
$\hat w$ does not. Those terms proportional to $w_0$ in the $u$ and $v$ columns represent a
correlation between the estimators of the phase $\psi$ and of the arrival time $\tau$ and
frequency shift $w$, but as we are unconcerned with estimating the phase $\psi$ or the
components $u$ and $v$ of the complex amplitude $u + iv = A\exp i\psi$, we shall not
bother working them out. It suffices then to invert the Fisher matrix $\Gamma$ after setting
$w_0$ equal to 0, whereupon it becomes

$$\Gamma = d^2\begin{bmatrix}
A_0^{-2} & 0 & 0 & 0 \\
0 & A_0^{-2} & 0 & 0 \\
0 & 0 & \Delta\omega^2 & -\Delta(\omega t) \\
0 & 0 & -\Delta(\omega t) & \Delta t^2
\end{bmatrix}. \tag{6-113}$$

Its determinant is

$$\det\Gamma = d^8 A_0^{-4}\,C$$

with

$$C = \Delta\omega^2\,\Delta t^2 - [\Delta(\omega t)]^2, \tag{6-114}$$

and its inverse is the approximate covariance matrix $B$ of the errors:

$$B \approx \Gamma^{-1} = \frac{1}{d^2 C}\begin{bmatrix}
A_0^2 C & 0 & 0 & 0 \\
0 & A_0^2 C & 0 & 0 \\
0 & 0 & \Delta t^2 & \Delta(\omega t) \\
0 & 0 & \Delta(\omega t) & \Delta\omega^2
\end{bmatrix}. \tag{6-115}$$

The simplest way to verify this inverse is to multiply the matrices in (6-113) and
(6-115) and show that the identity matrix results. The approximate covariance matrix
of the estimators of arrival time and Doppler shift is the lower-right block of (6-115),
with the value of $C$ given in (6-114). In this way we find that the variances of $\hat\tau$ and
$\hat w$ and their covariance are

$$\operatorname{Var}\hat\tau \approx \frac{\Delta t^2}{d^2 C}, \qquad \operatorname{Var}\hat w \approx \frac{\Delta\omega^2}{d^2 C}, \qquad \operatorname{Cov}\{\hat\tau, \hat w\} \approx \frac{\Delta(\omega t)}{d^2 C}.$$

If the echo signal contains no frequency modulation, $\Delta(\omega t) = 0$ and the errors
in the estimates of arrival time and Doppler shift are uncorrelated:

$$\operatorname{Var}\hat\tau \approx \frac{1}{d^2\,\Delta\omega^2}, \qquad \operatorname{Var}\hat w \approx \frac{1}{d^2\,\Delta t^2}, \qquad \operatorname{Cov}\{\hat\tau, \hat w\} \approx 0. \tag{6-116}$$

The variance of the estimator of the frequency shift $w$ is inversely proportional to the
mean-square duration $\Delta t^2$ of the signal. The variance of the estimator of the arrival
time is now the same as that in (6-84), which applies when the carrier frequency
$\Omega_r + w_0$ is known and need not be estimated. The same result arises if we discard
the fourth column and row of (6-112), set $w_0 = 0$ in the remaining $3 \times 3$ matrix, and
take its inverse.
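
The claim that the covariance block for arrival time and Doppler shift does not depend on the true shift $w_0$ can be confirmed numerically. The sketch below builds a Fisher matrix of the form of (6-112) for arbitrary illustrative parameter values (all of them assumptions, as are the function names), inverts it by Gauss-Jordan elimination, and compares the lower-right $2 \times 2$ block with the closed-form variances and covariance.

```python
def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fisher_matrix(d2, u0, v0, w0, dw2, dt2, dwt):
    """Fisher matrix with rows and columns ordered u, v, tau, w, as in (6-112)."""
    a2 = u0 * u0 + v0 * v0          # A0 squared
    return [
        [d2 / a2, 0.0, d2 * w0 * v0 / a2, 0.0],
        [0.0, d2 / a2, -d2 * w0 * u0 / a2, 0.0],
        [d2 * w0 * v0 / a2, -d2 * w0 * u0 / a2, d2 * (dw2 + w0 * w0), -d2 * dwt],
        [0.0, 0.0, -d2 * dwt, d2 * dt2],
    ]

# Illustrative values only.
d2, u0, v0, w0 = 50.0, 1.2, -0.5, 3.0
dw2, dt2, dwt = 4.0, 0.5, 0.6
c_det = dw2 * dt2 - dwt * dwt       # the quantity C of (6-114)

cov = invert(fisher_matrix(d2, u0, v0, w0, dw2, dt2, dwt))
var_tau, var_w, cov_tau_w = cov[2][2], cov[3][3], cov[2][3]
```

Changing `w0` alters the `u` and `v` rows and columns of `cov` but leaves `var_tau`, `var_w`, and `cov_tau_w` unchanged.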

For a given signal shape an increase in the bandwidth $\Delta\omega$ entails a decrease in
the rms duration $\Delta t$, and the variance of the estimate of the arrival time $\tau$ cannot
be reduced without at the same time accepting less precise measurements of the
frequency shift $w$ unless a marked change is made in the shape of the transmitted
pulse itself. Still considering signals without frequency modulation, we observe
that the product $\Delta\omega^2\,\Delta t^2$ can be made large by using signals whose spectra, for
a fixed duration $\Delta t$, are distributed as much toward high frequencies as possible.
This can be achieved by giving the pulses very sharp corners, but these cannot 
be transmitted unless the antenna and the lines that feed it have large bandwidths 
themselves. Such signals will yield the most accurate simultaneous estimates of 
arrival time and frequency. The increased accuracy will not be gained, however, 
unless the receiver contains filters properly matched to those signals. 

The analysis in Sec. 6.2.3 shows that in the limit of large input signal-to-noise
ratio the errors in the estimates depend only on the derivatives of $\ln\Lambda[v(t) \mid \theta]$ with
respect to the parameters. When the signals are observed in additive Gaussian noise
$n(t)$, these derivatives are linear in $n(t)$ and therefore themselves Gaussian random
variables. The errors $\hat\theta[v(t)] - \theta_0$ are consequently asymptotically Gaussian random
variables.

The reciprocal $1/\Delta\omega$ is roughly the width of any one of the pulses in the train
emerging from the matched filter.

The envelope of the output-pulse train, shown dashed in Fig. 6-2, would be
the output of a filter matched to a single signal having the same duration $MT_r$ as
the pulse train. Its bandwidth will be on the order of $(MT_r)^{-1}$. Ambiguity can be
expected with the pulse train (6-118) unless the arrival time of such a single, long
signal can be timed within a fraction of $T_r$, and by (6-84) this requires

$$\frac{(MT_r)^2}{d^2} \ll T_r^2.$$

The signal-to-noise ratio $d^2$ must therefore be large compared with $M^2 \gg 1$ if (6-119)
is to hold for a pulse train of length $MT_r$ and ambiguity is to be unlikely.

Ambiguity in measurement of the frequency shift $w$ must also be anticipated
when a train of pulses is transmitted. To show that, we consider the signal
components of the outputs at time $t_0 = \tau_0 + T'$ of a bank of filters matched to signals
with a dense set of frequency shifts. According to (6-99) these will be proportional
to $|\lambda(0, w - w_0)|^2$, where $w_0$ is the true value of the shift in the incoming signal. For
$\lambda(\cdot, \cdot)$ we shall use its expression (6-98) in the frequency domain.

The Fourier transform of the pulse train (6-118) is

$$f(\omega) = M^{-1/2}\sum_{k=0}^{M-1} e(\omega)\exp(-ikT_r\omega),$$

where

$$e(\omega) = \int_{-\infty}^{\infty} E(t)\,e^{-i\omega t}\,dt$$

is the Fourier transform of a component pulse. Summing, we write this as

$$f(\omega) = M^{-1/2}\,e(\omega)\,\frac{1 - e^{-iMT_r\omega}}{1 - e^{-iT_r\omega}} \tag{6-120}$$
$$= M^{1/2}\,e(\omega)\exp[-\tfrac12 i(M-1)T_r\omega]\,C(\omega),$$

where

$$C(\omega) = \frac{\sin\tfrac12 MT_r\omega}{M\sin\tfrac12 T_r\omega}. \tag{6-121}$$

In Fig. 6-3 we have plotted a portion of the function $|C(\omega)|$ for $M = 20$. It consists
of peaks of height equal to 1 that are separated in angular frequency by $2\pi/T_r$,
whose widths are on the order of $2\pi/MT_r$, and between which are $M - 2$ much
lower subsidiary peaks, separated by the $M - 1$ zeros at multiples of $2\pi/MT_r$. We can
call it a comb function. Because the narrowband
transfer function of a filter matched to any one of our pulse trains is proportional
to $C(\omega - w)$, such a filter has been called a comb filter.

The width of the transform $e(\omega)$ of $E(t)$ is on the order of $(\Delta' t^2)^{-1/2}$, where $\Delta' t^2$
is the mean-square duration of $E(t)$. The factor $e(\omega)$ modulates the comb function
in (6-120), and the width of that modulation is much greater than the separation
$2\pi/T_r$ between the tines of the comb.

Figure 6-3. Comb function $|C(\omega)|$ defined in (6-121); $M = 20$. (Horizontal axis: $\omega$,
marked at multiples of $2\pi/T_r$ from $-6\pi/T_r$ to $6\pi/T_r$.)

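According to (6-98), the signal components of the outputs of the matched
filters at time $t_0$ are proportional to

$$A^2 |\lambda(0, w - w_0)|^2
= A^2 \left| \int_{-\infty}^{\infty} f(\omega - w_0) f^*(\omega - w)\, \frac{d\omega}{2\pi} \right|^2
= A^2 M^2 \left| \int_{-\infty}^{\infty} e(\omega - w_0) e^*(\omega - w)\, C(\omega - w_0) C^*(\omega - w)\, \frac{d\omega}{2\pi} \right|^2,$$

as can be seen by a simple change of the variable of integration in the first integral.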

The signal component of the output will be largest for the filter tuned for
a frequency shift $w$ equal to the true frequency shift $w_0$, the tines of one comb
$C(\omega - w)$ then having fallen exactly on those of the other. As our imaginative eye
moves along the bank and away from that filter, we see the output decrease rapidly,
the tines having moved apart; and for $|w - w_0|$ greater than about $2\pi/MT_r$, the
output will be very small until we reach the filter tuned for a frequency shift equal
to $w_0 + 2\pi/T_r$. There the tines of the function $C(\omega - w)$ have again coincided with
those of $C(\omega - w_0)$, and a large output will be found. Likewise, the outputs at time
$t_0$ will be large from all filters tuned for frequency shifts equal to $w_0 \pm 2\pi p/T_r$,
where $p$ is any integer, provided only that $2\pi p/T_r < (\Delta' t^2)^{-1/2}$.

At low signal-to-noise ratio the noise may cause the output of a filter tuned for
one of those other frequency shifts to exceed the output of the "correct" filter tuned
for $w = w_0$. The result is an error in the estimate of the angular frequency shift that
is a multiple of $2\pi/T_r$, and this represents an ambiguity in the measurement of the
frequency of the radar echo.

The origin of this ambiguity is easily understood. Because the transmitted
pulses are coherent, the receiver can measure the change in phase of the r-f carrier
from one reflected pulse to the next by comparing the phases of the echoes with
that of a local oscillator synchronized with the transmitted phase. A target with
a velocity $v$ moves a distance $L = vT_r$ in one pulse-repetition period, and the r-f
phase of the carrier of the radar echo will change by an amount $\Delta\phi = \Omega_0(2L/c)$. The
target velocity is then given by $v = (\Delta\phi/\Omega_0)(c/2T_r)$. Because the receiver cannot
distinguish phase changes differing by multiples of $2\pi$, an ambiguity in the true
velocity of some multiple of $(2\pi/\Omega_0)(c/2T_r) = \tfrac{1}{2}\lambda/T_r$ arises; $\lambda$ is the wavelength of
the radiation.
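A short numerical illustration of this velocity ambiguity follows. It is only a sketch: the carrier frequency (3 GHz), the repetition period (1 ms) and the function names are assumed for the example and do not come from the text.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light, m/s

def velocity_ambiguity(carrier_hz, Tr):
    """Spacing of ambiguous velocities, (2*pi/Omega0)*(c/(2*Tr)) = lambda/(2*Tr)."""
    omega0 = 2 * math.pi * carrier_hz          # carrier angular frequency
    return (2 * math.pi / omega0) * (C_LIGHT / (2 * Tr))

def apparent_velocity(v, carrier_hz, Tr):
    """Velocity inferred when the pulse-to-pulse phase change is known only modulo 2*pi."""
    omega0 = 2 * math.pi * carrier_hz
    dphi = omega0 * (2 * v * Tr / C_LIGHT)     # true phase change in one period
    dphi_wrapped = math.fmod(dphi, 2 * math.pi)
    return (dphi_wrapped / omega0) * (C_LIGHT / (2 * Tr))

dv = velocity_ambiguity(3e9, 1e-3)
print(dv)                                            # roughly 50 m/s
print(apparent_velocity(20.0 + 3 * dv, 3e9, 1e-3))   # roughly 20 m/s
```

A target moving at 20 m/s plus three ambiguity intervals produces the same wrapped phase change as one moving at 20 m/s, so the two cannot be told apart from the phase comparison alone.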

If ambiguity in frequency is to be avoided and (6-119) is to hold, one must be
able to measure the frequency shift of a signal having a spectrum with an error
somewhat less than the separation $2\pi/T_r$ between the tines of the comb function;
that is,

$$\frac{1}{d^2\, \Delta' t^2} \ll \frac{4\pi^2}{T_r^2}.$$

The signal-to-noise ratio $d^2$ must be on the order of or larger than about
$T_r^2/(4\pi^2 \Delta' t^2)$, and this quantity will be large when, as usually, the pulses of the
train are widely separated.

Problems 

6-1. Given are $n$ independent measurements $v_1, v_2, \ldots, v_n$ of the noise voltage $v$ at a certain
point in a receiver. If the noise $v$ is a Gaussian random variable with expected value 0,
what is the maximum-likelihood estimator of its variance? Calculate the expected value
and the variance of this estimator as functions of the true variance. Give a sufficient
statistic for estimating $\mathrm{Var}\, v$.

6-2. Show that the estimator of the variance determined in Problem 6-1 is efficient. 

6-3. Suppose that in Problem 6-1 both the expected value and the variance of the voltage 
v are unknown. Work out their maximum-likelihood estimators based on the same n 
measurements. 

6-4. Show that the properties of the MAP estimator in Sec. 6.1.1 follow from the Bayesian
estimation theory of Sec. 6.1.4 when the cost function has the bizarre form

$$C(\hat{\theta}, \theta) = A - B\, \delta(\hat{\theta} - \theta), \qquad B > 0,$$

where $\delta(\cdot)$ is the $m$-dimensional delta function and $m$ the number of estimanda.

6-5. Analyze the estimation of the mean $m$ of a set of $n$ independent Gaussian random
variables, treated in Sec. 6.1.3, in the framework of Sec. 6.1.4. Assume for simplicity
that the prior mean $\mu$ equals 0. That is, consider the observed data as having the form

$$v_k = m + e_k, \qquad k = 1, 2, \ldots, n,$$

in which $m$ is a Gaussian random variable having expected value 0 and variance $\beta^2$,
and the errors $e_k$ are independently Gaussian with expected values 0 and variances $\delta^2$.
The errors are of course independent of the estimandum $m$; see (6-35). Use the method
of Sec. 6.1.4 to find the MMSE estimator of the mean $m$.

6-6. Show how to obtain the maximum-likelihood estimator of the azimuth 6 of a fixed 
target on the basis of echoes received by a radar antenna scanning at a uniform rate. 
The received signals can be taken as 

$$s_k(t) = \mathrm{Re}\, A\, r(\theta_0 + k\delta)\, F(t) \exp(i\Omega t + i\psi_k), \qquad 0 < t < T,$$
where the amplitude $A$, the azimuth $\theta_0$ of the target, and the phases $\psi_k$ are independent
and unknown, and all are to be estimated. The beam pattern of the antenna is
represented by $r(\theta)$. The noise is white and Gaussian with unilateral spectral density
$N$. It is assumed that the transmitting antenna is fixed and that all pulses incident
on the target have the same energy. Show from the Fisher formula or otherwise that
at large signal-to-noise ratio the variance of that maximum-likelihood estimator of the
azimuth is inversely proportional to the total signal-to-noise ratio $d_T^2 = 2E_T/N$ and
directly proportional to the mean-square beamwidth $\Theta^2$, defined by

$$\Theta^2 = \frac{\int [r(\theta)]^2\, d\theta}{\int [r'(\theta)]^2\, d\theta}.$$

The prime indicates differentiation. Here $E_T$ is the total received energy. Assume that
the angle $\delta$ through which the antenna turns between reception of one echo and the
next is small enough that summations over $k$ can be approximated by integrations over
azimuth.

6-7. Let $v(t)$ be a realization of Gaussian random noise of autocovariance $\phi(t, s) = B\, r(t, s)$,
where $B$ is positive, but unknown. If $v(t)$ is given over only a finite interval $(0, T)$,
the multiplicative constant $B$ can be estimated as accurately as desired. To show this,
consider the estimator

$$b_n = \frac{1}{n} \sum_{k=1}^{n} \frac{v_k^2}{\lambda_k},$$



where

$$v_k = \int_0^T f_k(t)\, v(t)\, dt,$$

with $\{f_k(t)\}$ the orthonormal eigenfunctions and $\{\lambda_k\}$ the eigenvalues of the integral
equation

$$\lambda f(t) = \int_0^T r(t, s) f(s)\, ds, \qquad 0 < t < T,$$

$\lambda_1 > \lambda_2 > \cdots > \lambda_k > 0$. Show that $b_n$ is an unbiased estimator of $B$ and that $\mathrm{Var}\, b_n \to 0$
as $n \to \infty$. Hence the larger the number of terms in $b_n$, the greater is the accuracy
of the estimator.

6-8. The linearly rising signal

$$s(t) = a + bt$$

is observed during an interval $(0, T)$ in the presence of white, Gaussian noise $n(t)$ of
unilateral spectral density $N$. The constants $a$ and $b$ are unknown. Find the maximum-likelihood
estimator of the slope $b$ of the signal on the basis of the receiver input
$v(t) = s(t) + n(t)$, and calculate the mean-square error of this estimator exactly. Compare
your result with that given by the Fisher formula.

6-9. The signal $s(t) = a \cos \Omega t + b \sin \Omega t$ is received in the presence of white, Gaussian
noise $n(t)$ of unilateral spectral density $N$. Find the maximum-likelihood estimators of
the parameters $a$ and $b$ based on observation of the input $v(t) = s(t) + n(t)$ during
an interval $(0, T)$. Make no approximations, but assume for simplicity that $\Omega T$ is an
integral multiple of $2\pi$. Calculate the variances and the covariance of the estimators
$\hat{a}[v(t)]$ and $\hat{b}[v(t)]$ of $a$ and $b$.

6-10. Given $n$ independent pairs $(x_k, y_k)$ of correlated Gaussian random variables $x$ and $y$,
known to have expected values zero, determine the maximum-likelihood estimators of
their common variance $\sigma^2 = \mathrm{Var}\, x = \mathrm{Var}\, y$ and of their covariance $\mu = E(xy)$. The
joint probability density function of an individual pair $(x, y)$ is

$$p(x, y) = \frac{1}{2\pi(\sigma^4 - \mu^2)^{1/2}} \exp\left[ -\frac{\sigma^2(x^2 + y^2) - 2\mu xy}{2(\sigma^4 - \mu^2)} \right].$$

Hint: First find the maximum-likelihood estimators of two elements of the inverse
covariance matrix of $x$ and $y$. Then solve for the estimators of $\sigma^2$ and $\mu$.

6-11. Determine the normalization constants in (6-12) and (6-15) and use (6-18) to show that 
for a symmetric block matrix such as that in (6-16) 

detf JJ W 1 = det u, det^ - v*,). 

6-12. A sine wave $A \cos(\Omega t + \theta)$ of known amplitude $A$ is received in the presence of white,
Gaussian noise during an interval of duration $T$. It is desired to estimate the phase $\theta$,
which carries information, relative to the phase of a sinusoid $B \cos \Omega t$ available at the
receiver. Take the input as

$$v(t) = \mathrm{Re}\, V(t)\, e^{i\Omega t}, \qquad V(t) = N(t) + A e^{i\theta}.$$

From the Cramér-Rao inequality determine a lower bound on the mean-square error
of an unbiased estimator $\hat{\theta}$ of the phase. Work out the maximum-likelihood estimator
of the phase and show how it might be realized. Calculate the probability density
function of this maximum-likelihood estimator $\hat{\theta}$; it depends on the input signal-to-noise
ratio. Derive an approximate form for this probability density function when that
signal-to-noise ratio is very large, and from this determine in turn the mean-square
error in $\hat{\theta}$ in that limit. Compare with the bound given by the Cramér-Rao inequality.

6-13. Information is sent as the difference $\phi$ between the phases of two successive narrowband
signals of known complex envelope $F(t)$, carrier frequency $\Omega$, and amplitude $A$. The
signals arrive during adjacent intervals, each of duration $T$, and are corrupted by
white, Gaussian noise of unilateral spectral density $N$. Thus the complex envelopes of
successive inputs $v_1(t)$ and $v_2(t)$ to the receiver have the forms

$$V_1(t) = AF(t)\, e^{i\psi} + N_1(t), \qquad 0 < t < T,$$
$$V_2(t) = AF(t)\, e^{i(\psi + \phi)} + N_2(t), \qquad 0 < t < T,$$

where the common carrier phase $\psi$ is unknown and uniformly distributed over $(0, 2\pi)$,
and $N_1(t)$ and $N_2(t)$ are statistically independent complex envelopes of the noise. Work
out the maximum-likelihood estimator of the phase difference $\phi$ in terms of $V_1(t)$ and
$V_2(t)$. Assuming that the input signal-to-noise ratio is large, calculate the mean-square
error $\mathrm{Var}\, \hat{\phi}$ of this estimator $\hat{\phi}$ of $\phi$. Make any justifiable approximations that are
necessary, and the earlier the better.

6-14. Formulate a Bayesian theory of the combined detection of a signal $s(t; \alpha)$ and the
estimation of an unknown parameter $\alpha$ on which the signal depends. Under hypothesis
$H_0$ (noise alone present) the joint probability density function of the data
$\mathbf{v} = (v_1, v_2, \ldots, v_n)$ is $p_0(\mathbf{v})$; under hypothesis $H_1$ (signal plus noise present) it is $p_1(\mathbf{v}; \alpha)$.
The data $\mathbf{v}$ are appropriate samples of the input; at the end, let $n \to \infty$. You are to
design a receiver that estimates the parameter $\alpha$ each time it decides that a signal is
present. If it chooses hypothesis $H_0$, it issues no estimate.

Define the following symbols:

$\zeta$ = prior probability of hypothesis $H_0$,

$z(\alpha)$ = prior probability density function of the parameter $\alpha$, given that a signal
is present,

$C_{00}$ = cost of choosing hypothesis $H_0$ when $H_0$ is true,

$C_{10}(\hat{\alpha})$ = cost of choosing hypothesis $H_1$ and issuing estimate $\hat{\alpha}$ of the parameter $\alpha$
when $H_0$ is true,

$C_{01}(\alpha)$ = cost of choosing hypothesis $H_0$ when $H_1$ is true and the signal parameter
equals $\alpha$,

$C_{11}(\hat{\alpha}, \alpha)$ = cost of choosing hypothesis $H_1$ and of issuing the estimate $\hat{\alpha}$ when the
signal is present with parameter value $\alpha$.

Write down the average cost of operation of a strategy for combined detection
and estimation, and show how the system should be designed to operate with minimum
average cost. Assume that the input to the receiver, observed during the interval $(0, T)$,
is

$$v(t) = n(t), \qquad (H_0)$$
$$v(t) = n(t) + s(t; \alpha), \qquad (H_1)$$

under the two hypotheses, and express the prescription for receiver operation in terms
of the likelihood functional $\Lambda[v(t); \alpha]$ for detecting a signal $s(t; \alpha)$ in the input $v(t)$.
Taking the cost function $C_{11}(\hat{\alpha}, \alpha) = g(\hat{\alpha} - \alpha)^2$, $g$ a positive constant, with

$$C_{10}(\hat{\alpha}) = C_{11}(\hat{\alpha}, 0), \qquad C_{01}(\alpha) = C_{11}(0, \alpha), \qquad C_{00} < 0,$$

express the optimum estimator of the parameter $\alpha$, under this condition of uncertainty
as to the presence or absence of the signal, in terms of the optimum estimator of $\alpha$
when it is known that the signal is present. (By optimum we mean minimizing average
cost.) Apply the theory to the detection of a signal $Af(t)$ of known form and unknown
amplitude $\alpha = A$ in the presence of white, Gaussian noise of unilateral spectral density
$N$. The amplitude $A$, to be estimated with the quadratic cost function defined above,
has a Gaussian prior probability density function with expected value 0 and variance $\sigma^2$.
Describe the optimum strategy in terms of matched filtering and whatever subsequent
processing is necessary.

6-15. Express $\bar{t}$ and $\bar{\omega}$, defined as in (6-101) and (6-102), in terms of the amplitude modulation
$M(t)$ and the phase modulation $\phi(t)$ of the narrowband signal as defined in (3-2).
Derive (6-105) through (6-107). Hint: The Fourier transform of $F'(t)$ is $i\omega f(\omega)$.
Detection of Signals with Unknown Parameters



7.1 UNKNOWN ARRIVAL TIME: THE THRESHOLD DETECTOR

Radar is used to detect targets that might be anywhere in a range interval much longer
than the electromagnetic pulses it sends out, and the echo signals may arrive at the
receiver at any time during the period $(0, T)$ between transmissions. Considering a
single such interpulse interval, we treat the receiver as a device for deciding between
two hypotheses on the basis of its input $v(t) = \mathrm{Re}\, V(t) \exp i\Omega t$,

$$H_0: \quad V(t) = N(t),$$
$$H_1: \quad V(t) = N(t) + S(t), \qquad S(t) = A e^{i\psi} F(t - \tau), \qquad 0 < t < T.$$

As we saw at the beginning of Sec. 3.3, the receiver can be presumed to have available
the complex envelope $V(t)$ of its input. Here $N(t)$ is the complex envelope of
white Gaussian noise of unilateral spectral density $N$, and $F(t)$ is proportional to
the complex envelope of the transmitted radar signal. As usual, $\Omega$ is the angular
frequency of the carrier of the narrowband signals. The amplitude and phase of
the received echo are $A$ and $\psi$ respectively, and $\tau$ is the time at which the radar
echo arrives; $\tau = 2R/c$ with $R$ the distance to the target and $c$ the velocity of light.
We take the complex envelope $F(t)$ to differ significantly from zero only during an
interval $(0, T')$ much shorter than the observation interval $(0, T)$.

We assume that the signal is just as likely to arrive at any time during $(0, T)$,
assigning to the epoch $\tau$ a uniform prior probability density function

$$z(\tau) = T^{-1}, \qquad 0 < \tau < T. \tag{7-1}$$

The phase $\psi$ is as before uniformly distributed over $(0, 2\pi)$, and because the receiver
does not keep track of the phase of the transmitted pulse, we take $\psi$ and $\tau$ as
statistically independent. Because the observation interval $(0, T)$ is so much longer
than the duration $T'$ of the pulse $F(t)$, we can disregard the possibility that the
echo may overlap one end or the other of that interval. As the size of the target is
unknown, the amplitude $A$ conveys no information about how far away it is, and we
can take $A$ as independent of $\psi$ and $\tau$.

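Were the echo amplitude $A$ known a priori, the optimum receiver would determine
the average likelihood functional (3-100), which here becomes

$$\Lambda[v(t); A] = \int_0^{2\pi} \frac{d\psi}{2\pi} \int_0^T \frac{d\tau}{T}\, \Lambda[v(t); A, \psi, \tau],$$

where by (3-54) as written for detection in white noise, with $Q(t) = N^{-1} S(t)$,

$$\Lambda[v(t); A, \psi, \tau] = \exp\left\{ \mathrm{Re}\left[ \frac{A}{N} e^{-i\psi} \int_0^T F^*(t - \tau) V(t)\, dt \right] - \frac{A^2}{2N} \int_0^T |F(t - \tau)|^2\, dt \right\}.$$

Upon carrying out the average over the phase $\psi$ this becomes

$$\Lambda[v(t); A] = \frac{1}{T} \int_0^T d\tau\, I_0\!\left( \frac{A}{N} \left| \int_0^T F^*(t - \tau) V(t)\, dt \right| \right) \exp\left[ -\frac{A^2}{2N} \int_0^T |F(t - \tau)|^2\, dt \right] \tag{7-2}$$

as in (3-80); $I_0(\cdot)$ is again the modified Bessel function. Because the echo signal is
unlikely to overlap either end of the interval $(0, T)$, we can take

$$\int_0^T |F(t - \tau)|^2\, dt \approx \int_{-\infty}^{\infty} |F(t)|^2\, dt$$

to be constant, and we normalize our signal $F(t)$ so that this integral equals 1.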

A receiver would need to know or assume some value of the amplitude $A$
in order to implement the average likelihood functional (7-2). It would pass the
input $v(t)$ through a filter matched to the signal $F(t)$, which we have taken to differ
significantly from zero only over an interval $(0, T')$ for $T' \ll T$. The complex
impulse response of this filter is

$$K(s) = \begin{cases} F^*(T' - s), & 0 \le s \le T', \\ 0, & s < 0, \ s > T', \end{cases}$$

and the complex envelope of the output $v_0(t) = \mathrm{Re}\, V_0(t) \exp i\Omega t$ of the filter is

$$V_0(t) = \int_0^{T'} F^*(T' - s) V(t - s)\, ds \approx \int_{-\infty}^{\infty} F^*(T' - s) V(t - s)\, ds
= \int_{-\infty}^{\infty} F^*(u) V(t - T' + u)\, du,$$

as in (3-16). The integral appearing in the argument of the Bessel function in (7-2),
on the other hand, is approximately

$$\int_{-\infty}^{\infty} F^*(t - \tau) V(t)\, dt = \int_{-\infty}^{\infty} F^*(u) V(u + \tau)\, du = V_0(\tau + T').$$

Thus the average likelihood functional can be written

$$\Lambda[v(t); A] = \exp\left[ -\frac{A^2}{2N} \int_{-\infty}^{\infty} |F(t)|^2\, dt \right] \frac{1}{T} \int_{T'}^{T+T'} I_0\!\left( \frac{A}{N} |V_0(t)| \right) dt, \tag{7-3}$$

and the receiver would pass the output $\mathrm{Re}\, V_0(t) \exp i\Omega t$ of the matched filter to a
rectifier having a characteristic proportional to $I_0(AN^{-1}|V_0|)$, whose output would
in turn be integrated over an interval $(T', T + T')$. If the integrated output, proportional
to $\Lambda[v(t); A]$, exceeded a certain decision level, a signal would be deemed
to have arrived sometime during the interval $(0, T)$. To evaluate the false-alarm and
detection probabilities of this receiver would be extremely difficult.

The threshold detector is based on the assumption that the signals are so weak
that both the Bessel function and the exponential function in (7-2) and (7-3) can be
approximated by the first two terms of their power series; all powers of the amplitude
$A$ higher than the second are neglected. Thus by (3-61) we write (7-3) as

$$\Lambda[v(t); A] \approx 1 - \frac{A^2}{2N} + \frac{A^2}{4N^2} \cdot \frac{1}{T} \int_{T'}^{T+T'} |V_0(t)|^2\, dt. \tag{7-4}$$

Multiplying this by an arbitrary prior probability density function $z_A(A)$ of the signal
amplitude $A$, integrating, and comparing the result with (3-108), we find that the
term of lowest order in the amplitude is that proportional to $A^2$, and we obtain the
threshold statistic

$$g_A = \frac{1}{T} \int_{T'}^{T+T'} \tfrac{1}{2} |V_0(t)|^2\, dt. \tag{7-5}$$

The factor $A^2/2N^2$ has been absorbed into the decision level.

The threshold receiver, or weak-signal detector, therefore passes the output
$\mathrm{Re}\, V_0(t) \exp i\Omega t$ of the matched filter through a quadratic rectifier, whose output is
integrated during the interval $(T', T + T')$, where $T'$ is the delay in that filter. If the
threshold statistic $g_A$ exceeds a decision level, set in such a way that the false-alarm
probability equals a preassigned value, the receiver decides that a signal is present.
This threshold detector requires no assumption about the actual amplitude $A$ of the
signal to be detected. It can be thought of as measuring the total energy picked
up by the antenna during the interval $(0, T)$, and it is sometimes called an energy
detector or radiometer.
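The chain of matched filter, quadratic rectifier and integrator can be sketched in discrete time as follows. This is a minimal illustration under stated assumptions: the complex envelope is sampled with step `dt`, the pulse is rectangular and normalized to unit energy, and the function names are invented for the example.

```python
import math

def matched_filter_output(V, F, dt):
    """Samples of V0(t) = integral F*(u) V(t - T' + u) du, as a Riemann sum.

    Element n of the result corresponds to the output time t = n*dt + T',
    where T' = len(F)*dt is the delay of the filter.
    """
    K = len(F)
    return [sum(F[m].conjugate() * V[n + m] for m in range(K)) * dt
            for n in range(len(V) - K + 1)]

def threshold_statistic(V0, dt, T):
    """(1/T) times the integral of (1/2)|V0(t)|^2 dt, as a Riemann sum."""
    return sum(0.5 * abs(v) ** 2 for v in V0) * dt / T

# Noise-free illustration (all numbers assumed): unit-energy rectangular pulse
# of duration T' = K*dt, received with amplitude A after a delay of 50 samples.
dt, K = 0.01, 10
F = [complex(1.0 / math.sqrt(K * dt))] * K
A, delay = 2.0, 50
V = [0j] * 200
for m in range(K):
    V[delay + m] = A * F[m]

V0 = matched_filter_output(V, F, dt)
peak = max(range(len(V0)), key=lambda n: abs(V0[n]))
print(peak, abs(V0[peak]))                            # peak at the delay, height A
print(threshold_statistic(V0, dt, T=len(V0) * dt))    # energy-like statistic
```

Without noise the rectified output peaks at the sample matching the echo delay, with height $A$ because the pulse has unit energy. The statistic itself ignores where the peak falls, which is why no assumption about the arrival time or the amplitude is needed.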

To compute the false-alarm and detection probabilities exactly for the threshold
statistic $g_A$ is most difficult. We resort to an approximation, replacing it by

$$g'_A = C \sum_{k=1}^{L} \tfrac{1}{2} |z_k|^2, \qquad z_k = V_0(t_k). \tag{7-6}$$

We are sampling the output of the matched filter at $L$ times uniformly spaced during
the interval $(T', T + T')$; the samples are separated by $T/L$. We want to choose the
value of $L$ so that these samples can be considered to be at least approximately
independent statistically. The sampling interval $T/L$ should therefore be on the order of
the width of the complex autocovariance function $\phi(s)$ of the output of the matched
filter. Indeed, if the signal is strictly bandlimited to a frequency range of width $W$
hertz, and $L = WT$, the samples $z_k$ are uncorrelated and, being circular-complex
Gaussian random variables, they are statistically independent. By the sampling theorem
and with a suitable choice of the constant $C$, therefore, $g'_A = g_A$ [Bal57], [Pap91,
pp. 376-9].

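The complex autocovariance function of the output $V_0(t)$ can be read off from
(6-108) by setting $w_1 = w_2 = 0$, and using (6-96) and (6-98) we find

$$\phi(s) = \tfrac{1}{2} E[V_0(t) V_0^*(t + s) \mid H_0] = N\lambda(s, 0)
= N \int_{-\infty}^{\infty} F(u - \tfrac{1}{2}s) F^*(u + \tfrac{1}{2}s)\, du
= N \int_{-\infty}^{\infty} |f(\omega)|^2 e^{-i\omega s}\, \frac{d\omega}{2\pi}, \tag{7-7}$$

where as before $f(\omega)$ is the Fourier transform of the complex envelope $F(t)$ of the
signal.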

The random variables $r_k = \tfrac{1}{2}|z_k|^2$ have an exponential distribution under hypothesis
$H_0$,

$$p_0(r) = a e^{-ar}\, U(r),$$

and for such a distribution the variance equals the square of the expected value,

$$E(r \mid H_0) = a^{-1}, \qquad \mathrm{Var}(r \mid H_0) = a^{-2} = [E(r \mid H_0)]^2, \tag{7-8}$$

as the reader can easily demonstrate.
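A quick Monte Carlo check of this property is sketched below. It assumes that each sample $z$ is a circular complex Gaussian variable with $\tfrac{1}{2}E|z|^2 = N$, so that $r = \tfrac{1}{2}|z|^2$ is exponential with expected value $N$; the value $N = 2$, the seed and the function name are arbitrary choices for the illustration.

```python
import random

def rectified_sample(N, rng):
    """Draw r = (1/2)|z|^2 for circular complex Gaussian z with (1/2)E|z|^2 = N."""
    sigma = N ** 0.5                       # each quadrature component has variance N
    x, y = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
    return 0.5 * (x * x + y * y)

rng = random.Random(1)                     # fixed seed for repeatability
N = 2.0                                    # illustrative noise level (assumed)
samples = [rectified_sample(N, rng) for _ in range(40000)]
mean = sum(samples) / len(samples)
var = sum((r - mean) ** 2 for r in samples) / (len(samples) - 1)
print(mean, var)                           # close to N and N**2 respectively
```

The sample variance comes out near the square of the sample mean, as the exponential law requires.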

We want to choose the constants $C$ and $L$ so that the threshold statistic $g_A$
and its approximation $g'_A$ have equal expected values and variances under hypothesis
$H_0$. From (7-5), because under hypothesis $H_0$ the output of the matched filter is a
stationary process,

$$E(g_A \mid H_0) = \tfrac{1}{2} E[|V_0(t)|^2 \mid H_0] = E(r \mid H_0) = \phi(0) = N$$

by (7-7), and by (7-6)

$$E(g'_A \mid H_0) = LC\, E(r \mid H_0) = LCN,$$

whence $C = 1/L$.

The variance of the approximate statistic $g'_A$ under hypothesis $H_0$ is

$$\mathrm{Var}(g'_A \mid H_0) = LC^2\, \mathrm{Var}(r \mid H_0) = LC^2 N^2 = \frac{N^2}{L} \tag{7-9}$$

when, as assumed, the terms in (7-6) are statistically independent. To calculate that
of the threshold statistic $g_A$, we write, from (7-5),

$$E(g_A^2 \mid H_0) = \frac{1}{4T^2} \int_{T'}^{T+T'} \int_{T'}^{T+T'} E[|V_0(t_1)|^2 |V_0(t_2)|^2 \mid H_0]\, dt_1\, dt_2.$$
We evaluate the expectation inside the integrand by means of (3-44), in which we
put

$$z_1 = z_3 = V_0(t_1), \qquad z_2 = z_4 = V_0(t_2).$$

The first term on the right side of (3-44), substituted into our double integral, yields
$[E(g_A \mid H_0)]^2$, leaving the second term to provide the variance, and we find

$$\mathrm{Var}(g_A \mid H_0) = \frac{1}{T^2} \int_{T'}^{T+T'} \int_{T'}^{T+T'} |\phi(t_1 - t_2)|^2\, dt_1\, dt_2.$$

When, as we assume, $T \gg T'$ and the duration $T$ of the observation interval much
exceeds the width of the autocovariance function $\phi(\cdot)$, the integral with respect to
$t_1$ can be extended over $(-\infty, \infty)$, and we obtain

$$\mathrm{Var}(g_A \mid H_0) = \frac{1}{T} \int_{-\infty}^{\infty} |\phi(s)|^2\, ds.$$

Equating this to (7-9), we find

$$L = TN^2 \left[ \int_{-\infty}^{\infty} |\phi(s)|^2\, ds \right]^{-1}.$$

By Parseval's theorem and (7-7) this becomes

$$L = T \left[ \int_{-\infty}^{\infty} |f(\omega)|^4\, \frac{d\omega}{2\pi} \right]^{-1} = WT \tag{7-10}$$

when we define the effective bandwidth $W$ of the signal as

$$W = \frac{|\phi(0)|^2}{\int_{-\infty}^{\infty} |\phi(s)|^2\, ds}
= \frac{\left[ \int_{-\infty}^{\infty} |f(\omega)|^2\, d\omega/2\pi \right]^2}{\int_{-\infty}^{\infty} |f(\omega)|^4\, d\omega/2\pi}. \tag{7-11}$$


The numerators have been introduced for the sake of generality, eliminating the need 
for the complex envelope F(t) of the signal to be normalized. 

For a strictly bandlimited signal, the quantity $W$ equals its bandwidth in hertz:

$$f(\omega) = \begin{cases} W^{-1/2}, & -\pi W < \omega < \pi W, \\ 0, & |\omega| > \pi W. \end{cases}$$

For a rectangular signal of duration $T'$, $W = 1.5/T'$. For a Gaussian signal of
mean-square bandwidth $\Delta\omega^2$, $W = \Delta\omega/\pi^{1/2}$.

Except in dealing with the ambiguity function $\lambda(\tau, w)$, the bandwidth as defined in
(7-11) is usually more useful than the rms bandwidth $\Delta\omega$.
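The two forms of the effective bandwidth in (7-11) can be evaluated numerically. The sketch below uses a simple midpoint rule; the function names, the unit values of $T'$ and $\Delta\omega$, and the truncation of the Gaussian spectrum at $\pm 10\,\Delta\omega$ are assumptions made for the illustration. For the rectangular pulse it uses the fact that the autocovariance of the matched-filter output is triangular, proportional to $1 - |s|/T'$ for $|s| < T'$.

```python
import math

def bandwidth_from_autocovariance(phi, s_max, n=4000):
    """W = |phi(0)|^2 / integral |phi(s)|^2 ds, midpoint rule on (-s_max, s_max)."""
    h = 2.0 * s_max / n
    integral = sum(abs(phi(-s_max + (k + 0.5) * h)) ** 2 for k in range(n)) * h
    return abs(phi(0.0)) ** 2 / integral

def bandwidth_from_spectrum(f_abs2, w_max, n=4000):
    """W = [int |f|^2 dw/2pi]^2 / int |f|^4 dw/2pi, midpoint rule on (-w_max, w_max)."""
    h = 2.0 * w_max / n
    grid = [-w_max + (k + 0.5) * h for k in range(n)]
    i2 = sum(f_abs2(w) for w in grid) * h / (2 * math.pi)
    i4 = sum(f_abs2(w) ** 2 for w in grid) * h / (2 * math.pi)
    return i2 ** 2 / i4

Tp = 1.0   # duration T' of the rectangular pulse (assumed value)
dw = 1.0   # rms bandwidth of the Gaussian signal (assumed value)

# Rectangular pulse: triangular autocovariance of the matched-filter output.
W_rect = bandwidth_from_autocovariance(lambda s: max(0.0, 1.0 - abs(s) / Tp), Tp)
# Gaussian signal: |f(w)|^2 proportional to exp(-w^2 / (2 dw^2)).
W_gauss = bandwidth_from_spectrum(lambda w: math.exp(-w * w / (2 * dw * dw)), 10 * dw)

print(W_rect * Tp)     # close to 1.5
print(W_gauss / dw)    # close to 1/sqrt(pi), about 0.564
```

Both ratios are insensitive to the overall scale of $\phi$ or $f$, which is the point of keeping the numerators in (7-11).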

Assuming now that the terms in our approximation $g'_A$ are roughly independent
statistically, we see that it is a detection statistic of the same form as $U$ in (4-19),
and we can use the false-alarm and detection probabilities as calculated in Sec. 4.2,

Figure 7-1. Energy-to-noise ratio (dB) for the threshold receiver to attain $Q_d = 0.999$
versus time-bandwidth product $WT$. Curves are indexed with the false-alarm
probability $Q_0$.



where $Q_L(\alpha, \beta)$ is the generalized Marcum Q function as defined in (4-27) ($L = M$),
and $D$ is related to the input energy-to-noise ratio




$E$ is the energy of the received signal, and $N$ is the unilateral spectral density of
the noise. Remember that we showed in Sec. 4.2.2 that the probability of detection
depends only on the total energy-to-noise ratio and not on how the signal energy is
distributed among the terms of the sum $U = g'_A$. The parameter $b$, related to the
decision level on the threshold statistic $g_A$ and its approximation $g'_A$, is set to provide
a preassigned false-alarm probability $Q_0$.

In Fig. 7-1 we have plotted the energy-to-noise ratio $S = E/N$ (dB) required,
in this approximation, to attain a probability $Q_d$ of detection equal to 0.999, for
various false-alarm probabilities $Q_0$, as a function of the time-bandwidth product
$L = WT$, $10 < L < 10^4$. Lengthening the interval during which the signal might
arrive by a factor of 1000 increases the required input energy-to-noise ratio by about
12 dB when the threshold detector is employed.
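The way the required energy grows with the time-bandwidth product can be explored with a small calculation. The sketch below does not use the parameters $D$ and $b$ of Sec. 4.2; it works under its own stated assumptions: the normalized sum $U = \sum_k r_k / N$ of $L$ independent terms is gamma distributed under $H_0$, and under $H_1$ twice this sum is noncentral chi-square with $2L$ degrees of freedom and noncentrality $2E/N$. The function names and the truncation `jmax` are hypothetical, and the direct summation is suitable only for moderate $L$ (a few hundred at most).

```python
import math

def gamma_tail(m, u):
    """Pr(sum of m unit-mean exponentials > u) = exp(-u) * sum_{i<m} u^i / i!."""
    term, total = math.exp(-u), 0.0
    for i in range(m):
        total += term
        term *= u / (i + 1)
    return total

def false_alarm_prob(u, L):
    """False-alarm probability for decision level u on the normalized sum."""
    return gamma_tail(L, u)

def detection_prob(u, L, S, jmax=400):
    """Poisson(S)-weighted mixture of gamma tails (noncentral chi-square law)."""
    weight, total = math.exp(-S), 0.0
    for j in range(jmax):
        total += weight * gamma_tail(L + j, u)
        weight *= S / (j + 1)
    return total

def threshold_for(Q0, L):
    """Bisect for the decision level giving false-alarm probability Q0."""
    lo, hi = 0.0, 10.0 * L + 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if false_alarm_prob(mid, L) > Q0:
            lo = mid                       # level too low, raise it
        else:
            hi = mid
    return 0.5 * (lo + hi)

Q0, S = 1e-6, 30.0                         # illustrative values (assumed)
for L in (1, 10, 100):
    u = threshold_for(Q0, L)
    print(L, u, detection_prob(u, L, S))
```

For a fixed energy-to-noise ratio and false-alarm probability, the detection probability printed in the last column falls as $L$ increases, which is the loss that the curves of Fig. 7-1 express as extra energy required.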

The performance of the radiometer in detecting spread-spectrum signals has 
been treated at length in [Dil89]. 
7.2 UNKNOWN ARRIVAL TIME: THE MAXIMUM-LIKELIHOOD DETECTOR

When we sample the rectified output of the matched filter at intervals $\Delta t = W^{-1}$,
as in forming the approximate detection statistic $g'_A$ in Sec. 7.1, we in effect reduce
our detection problem to that treated in Sec. 3.6.4. There the signal could appear
in one and only one of $M$ independent channels or inputs to the receiver. The
maximum-likelihood receiver described there decides that a signal is present if the
filtered and rectified output of any of the channels exceeds a certain decision level.
The comparison illustrated in Fig. 4-4 (see curves (b) and (c)) shows that a larger
probability of detection can be achieved by that maximum-likelihood receiver than
by a receiver that simply adds the quadratically rectified outputs of each channel.
The latter corresponds to the energy detector analyzed in Sec. 7.1.

In the context of the sampling approximation of Sec. 7.1, this means that a
decision for hypothesis $H_1$ should be made if any of the samples $|V_0(t_k)|^2$ passes the
decision level. Returning to the rectified output $|V_0(t)|^2$ as a continuous function of
time, we require a decision for $H_1$ if $|V_0(t)|^2$ exceeds a decision level $a$ at any time
in $T' < t < T + T'$. This is just what an ordinary radar receiver accomplishes. The
operator sees the rectified output $|V_0(t)|^2$ displayed on the A-scope [Sko62, pp. 6,
439-40]. It fluctuates owing to the random noise. If any fluctuation is large enough
to cross a threshold mentally applied by the operator, he attributes the excursion,
or blip, to the presence of an echo signal. The time $t_m$ at which the rectified output
is maximum provides, as we have seen in Chapter 6, an estimate $\hat{\tau}$ of the arrival
time $\tau$ of the echo through $\hat{\tau} = t_m - T'$, and that time $t_m$ is close to the instant
when the rectified output $|V_0(t)|^2$ crosses the threshold. If $|V_0(t)|^2$ remains below
the threshold during the entire interval, $|V_0(t)|^2 < a$, $T' < t < T + T'$, the operator
concludes that no signal is present.

In practice the operator sees the output $|V_0(t)|^2$ during intervals $(T', T + T')$
following each of a large number of transmitted pulses, even when the radar is
scanning in azimuth. The persistence of a blip at a certain point on the A-scope
trace will then enhance the likelihood of perceiving a target. The amount by which
this ability increases the probability $Q_d$ of detection is difficult to calculate. We can
instead imagine the detection process as carried out automatically by a receiver that
sums, or "integrates," the rectified outputs $|V_k(t)|^2$ of the matched filter during a
succession of $M$ interpulse intervals, $1 \le k \le M$. The crossing of the total output

$$r(t) = \sum_{k=1}^{M} |V_k(t)|^2 \tag{7-12}$$

over a decision level $a$ at any time in $T' < t < T + T'$ is then taken as indicating
the presence of a target.
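The decision rule just described can be sketched as follows, assuming that the matched-filter output of each interpulse interval has been sampled on a common grid beginning at $t = T'$ with step `dt`. The function names and the sample values are hypothetical.

```python
def integrate_sweeps(sweeps):
    """r[n] = sum over the M sweeps of |V_k[n]|^2 (all sweeps have equal length)."""
    return [sum(abs(v) ** 2 for v in column) for column in zip(*sweeps)]

def detect(r, a, dt, T_prime):
    """Decide for a target if r exceeds the level a anywhere in the sweep.

    Sample n of r corresponds to the time t = T' + n*dt. Returns the pair
    (decision, tau_hat), where tau_hat = t_m - T' is the estimated arrival
    time, or None when no signal is declared.
    """
    n_max = max(range(len(r)), key=r.__getitem__)
    if r[n_max] <= a:
        return False, None
    t_m = T_prime + n_max * dt
    return True, t_m - T_prime

# Two sweeps (M = 2) with a persistent blip at the third sample.
sweeps = [[0.1, 0.2, 3.0, 0.1],
          [0.2, 0.1, 2.0, 0.3]]
r = integrate_sweeps(sweeps)
print(detect(r, a=5.0, dt=0.5, T_prime=1.0))    # target declared, arrival time 1.0
print(detect(r, a=20.0, dt=0.5, T_prime=1.0))   # level not crossed, no estimate
```

Taking the maximum of the summed output serves both purposes at once: comparing it with the level $a$ gives the decision, and its location gives the range estimate.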

This system is diagrammed in Fig. 7-2. We assume that the target is stationary.
The delay line must retard its input by a time accurately equal to the interval $T$
between transmitted pulses. Its output is fed back to its input, where it is added
to the output of the quadratic rectifier. During the final observation interval the
alarm circuit is activated, and if the output of the adder crosses the decision level
built into the alarm, it signals the presence of an echo. At the same time it can turn
on another circuit, not shown, to measure the time $t_m = \tau + T'$ at which the total
output reaches its peak value and thus estimate the range of the target.

Figure 7-2. Receiver for signals of unknown arrival time.

The false-alarm and detection probabilities for a receiver that makes its decisions
in this manner are

$$Q_0 = 1 - \Pr[r(t) < a,\ T' < t < T + T' \mid H_0],$$
$$Q_d = 1 - \Pr[r(t) < a,\ T' < t < T + T' \mid H_1],$$

where for a single interval of observation $r(t) = |V_0(t)|^2$; for a number $M$ of intervals
$r(t)$ is given by (7-12). These probabilities are most difficult to calculate exactly. In
the next section we shall approximate the false-alarm probability $Q_0$ by the expected
number of times the random process $r(t)$ crosses above the decision level $a$, an
approximation that is valid when, as usually, $Q_0 \ll 1$. In Sec. 7.4 we shall argue
that at input energy-to-noise ratios large enough for useful probabilities of detection,
$Q_d$ can be approximated by the probability of detection of a signal whose time $\tau$ of
arrival is precisely known.



7.3 THE FALSE-ALARM RATE 

7.3.1 The First-passage Time Problem

The maximum-likelihood receiver for detecting a signal of unknown arrival time in 
Gaussian noise must determine whether a certain stochastic process crosses a fixed 
level during the observation interval. For a receiver working with a single input v(t), 
this process is the rectified output of a filter matched to the expected signal. When 
several inputs with independent noise components are available, the process is the 
sum of a number M of such rectified outputs. The false-alarm probability is the 
probability that the stochastic process crosses the decision level during the interval 
of observation when the receiver input contains only noise. Now we shall consider 
the problem of calculating this probability. 
The stochastic process of interest will be denoted by $r(t)$; it is stationary. The
probability $Q_0$ that $r(t)$ will exceed a level $r = a$ sometime during an observation
interval $0 < t < T$ is given by

$$Q_0 = 1 - P_0(T), \qquad P_0(t) = \Pr[r(t') < a,\ 0 < t' < t].$$

The function $P_0(t)$ is the probability that a process $r(t')$ drawn at random from its
ensemble lies below the level $r = a$ throughout the interval $(0, t)$. From the value
$P_0(0) = \Pr[r(0) < a]$ at $t = 0$ this function $P_0(t)$ decreases to 0 as the time $t$ goes
to infinity.

The negative derivative $q(t) = -dP_0/dt$, $0 < t < \infty$, is the first-passage-time
probability density function; $q(t)\, dt$ is the probability that the process $r(t)$ crosses the
level $r = a$ from below for the first time in the interval $(t, t + dt)$. Calculating the
density function $q(t)$ is essentially the same problem as finding the probability $P_0(t)$,
and it has been solved for only a few types of stochastic process $r(t)$.
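Because closed forms are rare, $P_0(t)$ is often estimated by simulation. The sketch below is one such estimate under explicit assumptions: the process is taken to be a unit-variance stationary Gaussian Markov sequence (a first-order autoregression with correlation $e^{-\mu\,dt}$ between adjacent samples), and crossings are checked only at the sampling instants, so the continuous-time probability is approximated rather than computed exactly. All names and parameter values are illustrative.

```python
import math
import random

def survival_curve(a, mu, dt, n_steps, n_paths, rng):
    """Estimate P0(t_j), the probability of staying below a at times 0, dt, ..., j*dt."""
    rho = math.exp(-mu * dt)               # correlation between adjacent samples
    s = math.sqrt(1.0 - rho * rho)         # keeps the variance equal to 1
    alive = [0] * (n_steps + 1)
    for _ in range(n_paths):
        x = rng.gauss(0.0, 1.0)            # stationary start
        for j in range(n_steps + 1):
            if x >= a:
                break                      # first passage: path leaves the count
            alive[j] += 1
            x = rho * x + s * rng.gauss(0.0, 1.0)
    return [c / n_paths for c in alive]

rng = random.Random(7)                     # fixed seed for repeatability
dt = 0.05
P0 = survival_curve(a=1.0, mu=1.0, dt=dt, n_steps=200, n_paths=4000, rng=rng)
# Finite-difference estimate of the first-passage-time density q(t) = -dP0/dt.
q = [(P0[j] - P0[j + 1]) / dt for j in range(len(P0) - 1)]
print(P0[0], P0[-1])
```

The estimated curve starts near $\Pr[r(0) < a]$ and can only decrease, since a path that has crossed the level is never counted again.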

Early work on first-passage-time probabilities was summarized by Siegert
[Sie51] in a paper in which he presented a general solution for stochastic processes
of the type known as Markov processes. For a Markov process $r(t)$ the conditional
probability density function of $r$ at time $t$, given the values $r_k = r(t_k)$ of the process
at an arbitrary set of $m$ previous times $t_m < t_{m-1} < \cdots < t_1 < t$, is a function only
of $r$ and $r_1 = r(t_1)$:

$$p(r, t \mid r_1, t_1; r_2, t_2; \ldots; r_m, t_m) = p(r, t \mid r_1, t_1)$$

[Hel91, pp. 444-8], [Pap91, pp. 635-7]. The function $p(r, t \mid r_1, t_1)$ is called the
transition probability density function of the process. The probability density function
of the value $r(t)$ of a Markov process at any time $t$ in the future depends only on its
probability density function at the present, and not on the past history of the process.
Siegert showed how the Laplace transform of the probability density function $q(t)$
could be written in terms of the Laplace transform of the transition probability
density function of the process.

The stationary Gaussian Markov process, for example, has an autocovariance
function $\phi(\tau) = \phi(0) \exp(-\mu|\tau|)$, and the Laplace transform of its first-passage-time
probability function is a quotient of Weber-Hermite functions. When the interval is
so long that $\mu T \gg 1$, the probability density function $q(t)$ is governed mainly by
the pole of this Laplace transform lying nearest the origin; and if the level $r = a$ is
much higher than the rms value $[\phi(0)]^{1/2}$ of the process, the probability $Q_0$ that it
will be crossed at least once during an interval $(0, T)$ is approximately

$$Q_0 \approx 1 - e^{-\eta T}, \qquad \eta = \frac{\mu a}{[2\pi\phi(0)]^{1/2}} \exp\left[ -\frac{a^2}{2\phi(0)} \right], \qquad \mu T \gg 1, \quad a^2 \gg \phi(0), \tag{7-14}$$

where $\eta$ is the reciprocal of the expected value of the first-passage time and has been
calculated by Siegert's formulas.

A second Markov process lending itself to this kind of analysis is the squared 
envelope 

$$r(t) = [x(t)]^2 + [y(t)]^2$$

of a narrowband Gaussian process whose complex autocovariance function is the
exponential function $\phi(\tau) = \phi(0)\exp(-\mu|\tau|)$. This process, whose quadrature
components $x(t)$ and $y(t)$ are independent Gaussian Markov processes, has been treated
by Rice [Ric58], Tikhonov [Tik61], and this writer [Hel59]. The Laplace transform
of its first-passage-time probability density function can be expressed as a quotient of
confluent hypergeometric functions. When $\mu T \gg 1$ and $a \gg \phi(0)$, the probability
$Q_0$ that the level $r = a$ will be crossed at least once in $(0, T)$ is asymptotically

$$Q_0 \approx 1 - e^{-\eta T}, \qquad \eta = \frac{\mu a}{\phi(0)}\exp\left[-\frac{a}{2\phi(0)}\right], \qquad \mu T \gg 1,\ a \gg \phi(0). \tag{7-15}$$

When the expected value of the process $r(t)$ is 0 and $a = 0$, the problem of
calculating the probability density function $q(t)$ or the distribution $P_0(t)$ is known
as the zero-crossing problem. Problems of this type have been extensively studied
by Longuet-Higgins [Lon62], McFadden [McF62], Rainal [Rai62], [Rai87], Slepian
[Sle62], and others. Besides the first-passage-time probability density function, they
have investigated the distribution of the number of times $r(t)$ crosses the level $r = 0$
in a given interval and the distributions of the lengths of the intervals between such
crossings.

7.3.2 The Crossing-rate Approximation 

In a radar receiver that is to detect signals of unknown arrival time the false-alarm
probability must be kept much smaller than 1, simply because the user cannot afford
to let it be large. In a defensive system based on radar, for instance, it is so costly
to send missiles to attack apparent targets that few sorties can be permitted. It can
therefore be assumed that the level $a$ is so much larger than the rms value of the
process $r(t)$ that there is only a small probability that the output $r(t)$ of the receiver
will exceed it at any time during the observation interval $(0, T)$. In addition, the
interval $(0, T)$ is much longer than the correlation time of the stochastic process
$r(t)$, which is on the order of the reciprocal of the bandwidth of the signal. Over
most of the interval $(0, T)$ the process $r(t)$ has negligible correlation with its initial
value $r(0)$ and its initial time derivatives, and the probability $Q_0 = 1 - P_0(T)$ will
be almost independent of them as well. Under these conditions it is useful to define
an average rate $\eta(a)$ with which the stochastic process $r(t)$ crosses the level $r = a$
from below. For a stationary process this rate $\eta$ is constant, and in an interval of
length $T$ the average number of crossings is $\eta T$. In radar $\eta$ is called the false-alarm
rate.

Let $P_n(T)$ be the probability that the rectified process $r(t)$ crosses the decision
level $a$ exactly $n$ times during $(0, T)$. Then the average number of crossings is

$$\eta T = P_1(T) + 2P_2(T) + 3P_3(T) + \cdots,$$

and the false-alarm probability is

$$Q_0 = P_1(T) + P_2(T) + P_3(T) + \cdots = \eta T - P_2(T) - 2P_3(T) - \cdots.$$

Under the assumption that the decision level $a$ is so high that the probabilities $P_n(T)$,
$n \ge 2$, of two or more crossings in $(0, T)$ are negligible, the false-alarm probability
is approximately equal to the product of the false-alarm rate $\eta$ and the duration $T$
of the observation interval,

$$Q_0 \approx \eta T. \tag{7-16}$$

Cramér [Cra66] showed that if $r(t)$ is a Gaussian random process, then in the limit
$a^2 \gg \operatorname{Var} r(t)$ the probabilities $P_n(T)$ have approximately the Poisson form

$$P_n(T) \approx \frac{(\eta T)^n}{n!}\,e^{-\eta T},$$

and again

$$Q_0 \approx 1 - e^{-\eta T} \approx \eta T, \qquad \eta T \ll 1.$$
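The Poisson form can be checked numerically. The following is only an illustrative sketch: the function names are hypothetical, and the values of $\eta T$ are arbitrary choices, not taken from the text. It evaluates the Poisson probabilities $P_n(T)$ and compares $Q_0 = 1 - e^{-\eta T}$ with the linear approximation $\eta T$.

```python
import math

def poisson_crossing_probabilities(eta, T, n_max):
    """Return [P_0(T), ..., P_{n_max}(T)] under the Poisson approximation."""
    mean = eta * T  # expected number of upward crossings in (0, T)
    return [math.exp(-mean) * mean**n / math.factorial(n) for n in range(n_max + 1)]

def false_alarm_probability(eta, T):
    """Q_0 = 1 - exp(-eta*T): probability of at least one crossing in (0, T)."""
    return -math.expm1(-eta * T)  # expm1 keeps precision when eta*T is tiny

if __name__ == "__main__":
    # Arbitrary illustrative values of eta*T (T is set to 1).
    for mean in (1e-4, 1e-2, 1e-1, 1.0):
        q0 = false_alarm_probability(mean, 1.0)
        print(f"eta*T = {mean:g}: Q0 = {q0:.6g}, eta*T - Q0 = {mean - q0:.3g}")
```

The gap between $\eta T$ and $Q_0$ is of order $(\eta T)^2/2$, so the linear form (7-16) is adequate only when $\eta T \ll 1$.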
We now turn to the calculation of the false-alarm rate $\eta$.

7.3.3 The Crossing Rate of a Stochastic Process

The false-alarm rate, or the expected number of crossings of $r = a$ per second, is
given by Rice's formula

$$\eta(a) = \int_0^\infty r'\,p(a, r')\,dr', \tag{7-17}$$

where $r' = dr/dt$ is the rate of change of the stochastic process $r(t)$, which must be
differentiable at least once [Ric44, eq. 3.3-5]. (Primes indicate differentiation with
respect to the time $t$.) The joint probability density function of the rectified output
$r$ and its rate $r'$ of change is $p(r, r')$. The history of this result has been narrated
by Rainal [Rai88].

To derive (7-17) we utilize the counting functional

$$g(t) = r'(t)\,\delta(r(t) - a)$$

introduced by Middleton [Mid60a, p. 426]. If the process $r(t)$ crosses the level $r = a$
in the brief interval $(t - \varepsilon, t + \varepsilon)$ and is then increasing,

$$\int_{t-\varepsilon}^{t+\varepsilon} g(t)\,dt = \int_{t-\varepsilon}^{t+\varepsilon} r'(t)\,\delta(r(t) - a)\,dt = \int_{r(t-\varepsilon)}^{r(t+\varepsilon)} \delta(r - a)\,dr = 1, \qquad r' > 0.$$

If it crosses $r = a$ while decreasing,

$$\int_{t-\varepsilon}^{t+\varepsilon} g(t)\,dt = \int_{r(t-\varepsilon)}^{r(t+\varepsilon)} \delta(r - a)\,dr = -1, \qquad r' < 0,$$

because then $r(t + \varepsilon) < r(t - \varepsilon)$. If $r(t)$ does not cross $r = a$ during $(t - \varepsilon, t + \varepsilon)$
at all,

$$\int_{t-\varepsilon}^{t+\varepsilon} g(t)\,dt = 0,$$

for then the delta function $\delta(r - a)$ stands outside the interval $(r(t - \varepsilon), r(t + \varepsilon))$.
Introducing a factor $U(r')$, with $U(\cdot)$ the unit step function, into the counting
functional eliminates the downward crossings from the count. If we break up the
interval $(0, T)$ into subintervals $(t - \varepsilon, t + \varepsilon)$, we see that the total number of times
that the process $r(t)$ crosses $r = a$ in an upward direction during $(0, T)$ must be the
random variable

$$N_+(T) = \int_0^T r'(t)\,U(r'(t))\,\delta(r(t) - a)\,dt.$$



The average number of upward crossings is obtained by multiplying the integrand by the
joint probability density function $p(r, r'; t)$ of the random variables $r$ and $r'$ at time
$t$ and integrating:

$$E[N_+(T)] = \int_0^T dt \int_{-\infty}^{\infty} dr \int_{-\infty}^{\infty} dr'\, p(r, r'; t)\, r'\, U(r')\, \delta(r - a) = \int_0^T dt \int_0^\infty r'\, p(a, r'; t)\, dr'.$$

Because the process $r(t)$ is stationary, $p(r, r'; t) = p(r, r')$, and

$$E[N_+(T)] = \eta T,$$

where $\eta$ is the constant false-alarm rate given in (7-17).
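The counting argument has a simple discrete analogue for sampled data. The sketch below is an illustration only; the function names and the sine-wave test signal are assumptions introduced here. An upward crossing is counted whenever two successive samples straddle the level with the later one at or above it, and dividing by the record length gives an empirical estimate of the rate $\eta$.

```python
import math

def count_upcrossings(samples, level):
    """Count sample pairs with samples[k-1] < level <= samples[k]."""
    count = 0
    for previous, current in zip(samples, samples[1:]):
        if previous < level <= current:  # downward crossings are ignored
            count += 1
    return count

def empirical_crossing_rate(samples, level, dt):
    """Upward crossings per unit time for samples spaced dt apart."""
    duration = dt * (len(samples) - 1)
    return count_upcrossings(samples, level) / duration

if __name__ == "__main__":
    # Hypothetical test signal: a 5 Hz sine wave sampled for one second.
    dt = 1e-3
    wave = [math.sin(2.0 * math.pi * 5.0 * k * dt) for k in range(1001)]
    print(count_upcrossings(wave, 0.5))
    print(empirical_crossing_rate(wave, 0.5, dt))
```

The sampling interval must be short compared with the correlation time of the process, or crossings that occur between samples will be missed.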

For a stationary Gaussian random process $r(t)$ with expected value zero and
autocovariance function $\phi(\tau)$, the derivative $r'(t)$ has zero expected value and variance

$$\operatorname{Var} r' = -\left.\frac{d^2\phi}{d\tau^2}\right|_{\tau=0} = |\phi''(0)|,$$

and $r'(t)$ is independent of $r(t)$ [Hel91, p. 412], [Pap91, p. 314]. Hence their joint
density function is

$$p(r, r') = \frac{1}{2\pi\sqrt{\phi(0)\,|\phi''(0)|}}\exp\left[-\frac{r^2}{2\phi(0)} - \frac{r'^2}{2|\phi''(0)|}\right],$$

and from (7-17) the crossing rate is

$$\eta = \frac{1}{2\pi}\left[\frac{|\phi''(0)|}{\phi(0)}\right]^{1/2}\exp\left[-\frac{a^2}{2\phi(0)}\right].$$

It is necessary that $|\phi''(0)|$ be finite,

$$|\phi''(0)| = \int_{-\infty}^{\infty}\omega^2\,\Phi(\omega)\,\frac{d\omega}{2\pi} < \infty,$$

where $\Phi(\omega)$ is the spectral density of $r(t)$. The Gauss-Markov process mentioned
at the beginning of Sec. 7.3.1 has $|\phi''(0)| = \infty$, but for such a process we can use
(7-14) to approximate the false-alarm probability when $\eta T \ll 1$.
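As a numerical check on the Gaussian crossing rate, Rice's formula (7-17) can be integrated directly. This is a minimal sketch under stated assumptions: the function names are hypothetical, `phi0` stands for $\phi(0)$ and `phi2` for $|\phi''(0)|$, and the integral is truncated at twelve standard deviations of $r'$ and evaluated by the trapezoidal rule.

```python
import math

def gaussian_joint_density(r, r_dot, phi0, phi2):
    """Joint density of a zero-mean stationary Gaussian process and its derivative."""
    norm = 2.0 * math.pi * math.sqrt(phi0 * phi2)
    return math.exp(-r * r / (2.0 * phi0) - r_dot * r_dot / (2.0 * phi2)) / norm

def rice_rate_numeric(level, phi0, phi2, steps=20000):
    """Trapezoidal evaluation of the integral of r' p(a, r') over r' > 0."""
    upper = 12.0 * math.sqrt(phi2)  # truncation point; the tail beyond is negligible
    h = upper / steps
    total = 0.5 * upper * gaussian_joint_density(level, upper, phi0, phi2)
    for k in range(1, steps):  # the integrand vanishes at r' = 0
        x = k * h
        total += x * gaussian_joint_density(level, x, phi0, phi2)
    return total * h

def rice_rate_closed_form(level, phi0, phi2):
    """Closed-form crossing rate for the Gaussian process."""
    return math.sqrt(phi2 / phi0) / (2.0 * math.pi) * math.exp(-level**2 / (2.0 * phi0))

if __name__ == "__main__":
    # Arbitrary illustrative parameters.
    print(rice_rate_numeric(3.0, 1.0, 4.0))
    print(rice_rate_closed_form(3.0, 1.0, 4.0))
```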
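When the level $a$ is high, $a^2 \gg \phi(0)$,

$$\Pr(r > a) \approx \left[\frac{\phi(0)}{2\pi a^2}\right]^{1/2}\exp\left[-\frac{a^2}{2\phi(0)}\right]$$

by (2-135). Then we can write the crossing rate as approximately

$$\eta \approx \left[\frac{2\pi a^2}{\phi(0)}\right]^{1/2}\frac{\beta}{2\pi}\,\Pr(r > a), \tag{7-18}$$

where

$$\beta^2 = \frac{|\phi''(0)|}{\phi(0)} = \frac{\int_{-\infty}^{\infty}\omega^2\,\Phi(\omega)\,d\omega}{\int_{-\infty}^{\infty}\Phi(\omega)\,d\omega} \tag{7-19}$$

is the mean-square bandwidth of the process $r(t)$. The first factor on the right of
(7-18) is ordinarily on the order of 1. We can interpret the crossing rate $\eta$ as roughly
equal to the number $\beta/2\pi$ of approximately independent samples of $r(t)$ per second,
multiplied by the probability $\Pr(r > a)$ that any one of those samples exceeds the
level $a$.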



7.3.4 The Crossing Rate of the Rectified Process



In the maximum-likelihood receiver of a signal of unknown arrival time, as described
in Sec. 7.2, the process $r(t)$ is the output of a quadratic rectifier following a filter
matched to the signal to be detected:

$$r(t) = |V_0(t)|^2 = [x(t)]^2 + [y(t)]^2 = |z(t)|^2, \tag{7-20}$$

where $V_0(t) = z(t) = x(t) + iy(t)$ is a circular complex Gaussian random process
with expected value zero under hypothesis $H_0$. We introduce the notation $z(t)$ for
convenience. The derivative of this rectified process is

$$r'(t) = 2(xx' + yy') = 2\operatorname{Re}[z^*(t)\,z'(t)], \tag{7-21}$$

where

$$z'(t) = \frac{dz}{dt} = x'(t) + iy'(t)$$

is the derivative of the complex envelope $z(t)$.

We need the joint probability density function of $r$ and $r'$, and we obtain it
by way of the conditional probability density function $p(r' \mid z)$. When the complex
variable $z$ is fixed, so are $x$, $y$, and $r$. (Unless otherwise noted, all these are samples
of random processes taken at the same time $t$.) Thereupon $r'$ in (7-21) is a linear
combination of Gaussian random variables $x'$ and $y'$, and

$$p(r' \mid z) = p(r' \mid x, y)$$

must be a Gaussian density function, determined entirely by the conditional expected
value $E(r' \mid x, y)$ and the conditional variance $\operatorname{Var}(r' \mid x, y)$, which we now calculate.

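First we set up the joint circular Gaussian probability density function for the
real and imaginary parts of $z$ and $z' = dz/dt$. The complex autocovariance function
of the stationary process $z(t)$ is, as in (3-32),

$$\phi(t_2 - t_1) = \tfrac{1}{2}E[z(t_1)\,z^*(t_2)].$$

Hence

$$\tfrac{1}{2}E[z'(t_1)\,z^*(t_2)] = \frac{\partial}{\partial t_1}\phi(t_2 - t_1), \tag{7-22}$$

and setting $t_1 = t_2 = t$, we obtain

$$\tfrac{1}{2}E(z'z^*) = -\left.\frac{d\phi}{d\tau}\right|_{\tau=0} = -\phi_0',$$

which is purely imaginary. Likewise,

$$\tfrac{1}{2}E(zz'^*) = \left.\frac{d\phi}{d\tau}\right|_{\tau=0} = \phi_0'.$$

Differentiating (7-22) with respect to $t_2$ and setting $t_1 = t_2 = t$,

$$\tfrac{1}{2}E(|z'|^2) = -\left.\frac{d^2\phi}{d\tau^2}\right|_{\tau=0} = -\phi_0'' = |\phi_0''|. \tag{7-23}$$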



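In terms of the narrowband spectral density $\Phi(\omega)$ of the circular-complex process
$z(t)$, defined as in (3-19), the complex autocovariance function is

$$\phi(\tau) = 2\int_{-\infty}^{\infty}\Phi(\omega)\,e^{i\omega\tau}\,\frac{d\omega}{2\pi},$$

and differentiating twice with respect to $\tau$, we find

$$\phi_0' = 2\int_{-\infty}^{\infty} i\omega\,\Phi(\omega)\,\frac{d\omega}{2\pi}, \qquad \phi_0'' = -2\int_{-\infty}^{\infty}\omega^2\,\Phi(\omega)\,\frac{d\omega}{2\pi}.$$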


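By appropriately choosing the carrier frequency $\Omega$, we can make $\phi_0'$ equal to zero,
and we assume that that has been done. Then the complex covariance matrix of $z$
and $z'$ is diagonal, and by (3-40) their joint circular Gaussian density function is

$$p(z, z') = p(x, y, x', y') = \frac{1}{4\pi^2\phi_0|\phi_0''|}\exp\left[-\frac{|z|^2}{2\phi_0} - \frac{|z'|^2}{2|\phi_0''|}\right], \tag{7-24}$$

where $\phi_0 = \phi(0)$, and $x$, $y$, $x'$, and $y'$ are independent.
By dividing by

$$p(z) = \frac{1}{2\pi\phi_0}\exp\left(-\frac{|z|^2}{2\phi_0}\right)$$

we find that the conditional probability density function of $x'$ and $y'$, given $x$ and
$y$, is

$$p(x', y' \mid x, y) = \frac{1}{2\pi|\phi_0''|}\exp\left(-\frac{|z'|^2}{2|\phi_0''|}\right).$$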

The conditional expected value of the derivative $z'$ is therefore zero, and by (7-21)
that of $r'$ is also zero. By (7-21), (7-20), and (7-23),

$$\operatorname{Var}(r' \mid z) = \operatorname{Var}(r' \mid r) = 4(x^2\operatorname{Var} x' + y^2\operatorname{Var} y') = 4(x^2 + y^2)\,|\phi_0''| = 4r\,|\phi_0''|. \tag{7-25}$$



Thus the conditional density function of the conditionally Gaussian random
variable $r'$, given $x$ and $y$ and hence $r$, must be

$$p(r' \mid r) = \frac{1}{\sqrt{8\pi r\,|\phi_0''|}}\exp\left(-\frac{r'^2}{8r\,|\phi_0''|}\right).$$

Now with $r = |z|^2$ we find from the circular Gaussian density function in (7-24) that

$$p(r) = \frac{1}{2\phi_0}\exp\left(-\frac{r}{2\phi_0}\right)U(r), \tag{7-26}$$

and the joint probability density function of $r$ and $r'$ is therefore

$$p(r, r') = \frac{1}{2\phi_0\sqrt{8\pi r\,|\phi_0''|}}\exp\left[-\frac{r}{2\phi_0} - \frac{r'^2}{8r\,|\phi_0''|}\right]U(r).$$



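Substituting this into Rice's crossing-rate formula (7-17) and integrating, we find
that the crossing rate for this quadratically rectified process is

$$\eta = \left[\frac{2\pi a}{\phi_0}\right]^{1/2}\beta\,\exp\left(-\frac{a}{2\phi_0}\right), \tag{7-27}$$

with

$$\beta^2 = \frac{|\phi_0''|}{4\pi^2\phi_0}.$$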

In terms of the narrowband spectral density $\Phi(\omega)$ of the process $V_0(t)$, defined as in
(3-19), the parameter $\beta^2$ is

$$\beta^2 = \frac{\int_{-\infty}^{\infty}(\omega - \bar{\omega})^2\,\Phi(\omega)\,d\omega}{4\pi^2\int_{-\infty}^{\infty}\Phi(\omega)\,d\omega},$$

with the mean frequency $\bar{\omega}$ of $V_0(t)$ measured from an arbitrary carrier frequency
$\Omega$. Thus $\beta$ is the rms bandwidth (in hertz) of the process $V_0(t)$ at the output of the
narrowband matched filter. Because that filter is matched to the signal $F(t)$,

$$\Phi(\omega) = \frac{N}{2}\,|f(\omega)|^2,$$

where as before $f(\omega)$ is the Fourier transform of the complex envelope $F(t)$ of the
signal. Thus $\beta = \Delta\omega/2\pi$, where $\Delta\omega^2$ is the mean-square bandwidth (6-91). For
a signal $F(t)$ bandlimited to $-\pi W < \omega < \pi W$, $\beta = W/\sqrt{12}$; and for a Gaussian
signal, 

If the signal $F(t)$ is rectangular, on the other hand, $\beta$ is infinite; but a radar signal
cannot rise infinitely rapidly, and the crossing rate $\eta$ is inevitably finite. The
bandwidth parameter $\beta$ is also infinite for a process $r(t) = [x(t)]^2 + [y(t)]^2$ when
$x(t)$ and $y(t)$ are independent Gaussian Markov processes, but for such a rectified
process we can use (7-15).

We can express the crossing rate $\eta$ in (7-27) much as in (7-18),

$$\eta = \left[\frac{2\pi a}{\phi_0}\right]^{1/2}\beta\,\Pr(|V_0(t)|^2 > a),$$

and as the first factor is on the order of 1, the rate is nearly the number of effectively
independent samples per second times the probability that any one sample of $|V_0(t)|^2$
exceeds the level $a$.
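The crossing rate of the rectified process can be verified by integrating Rice's formula (7-17) numerically with the joint density of $r$ and $r'$ obtained above. The sketch below is illustrative only; the function names are hypothetical, `phi0` and `phi2` stand for $\phi_0$ and $|\phi_0''|$, and $\beta$ is computed from $\beta^2 = |\phi_0''|/(4\pi^2\phi_0)$.

```python
import math

def rectified_joint_density(r, r_dot, phi0, phi2):
    """p(r, r') for r = |z|^2: exponential in r, conditionally Gaussian in r'."""
    if r <= 0.0:
        return 0.0
    cond_var = 4.0 * r * phi2  # Var(r' | r) as in (7-25)
    p_r = math.exp(-r / (2.0 * phi0)) / (2.0 * phi0)
    p_cond = math.exp(-r_dot * r_dot / (2.0 * cond_var)) / math.sqrt(2.0 * math.pi * cond_var)
    return p_r * p_cond

def crossing_rate_numeric(a, phi0, phi2, steps=20000):
    """Trapezoidal evaluation of Rice's integral over r' > 0 at the level a."""
    upper = 12.0 * math.sqrt(4.0 * a * phi2)  # twelve conditional standard deviations
    h = upper / steps
    total = 0.5 * upper * rectified_joint_density(a, upper, phi0, phi2)
    for k in range(1, steps):  # the integrand vanishes at r' = 0
        x = k * h
        total += x * rectified_joint_density(a, x, phi0, phi2)
    return total * h

def crossing_rate_closed_form(a, phi0, beta):
    """eta = sqrt(2*pi*a/phi0) * beta * exp(-a/(2*phi0)), with beta in hertz."""
    return math.sqrt(2.0 * math.pi * a / phi0) * beta * math.exp(-a / (2.0 * phi0))

if __name__ == "__main__":
    phi0, phi2, a = 1.0, 4.0, 10.0  # arbitrary illustrative values
    beta = math.sqrt(phi2 / phi0) / (2.0 * math.pi)
    print(crossing_rate_numeric(a, phi0, phi2))
    print(crossing_rate_closed_form(a, phi0, beta))
```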

If $M$ inputs are filtered, rectified, delayed, and superimposed, as in (7-12), the
random process involved is

$$r(t) = \sum_{k=1}^{M}|z_k(t)|^2, \tag{7-28}$$

with $z_k(t) = V_k(t) = x_k(t) + iy_k(t)$ the complex envelope of the output of the
matched filter during the $k$th interval. Then as in (7-21)

$$r'(t) = 2\operatorname{Re}\sum_{k=1}^{M} z_k^*(t)\,z_k'(t).$$

By fixing all the $z_k$'s in our condition, we can carry through the same analysis, and
(7-25) becomes

$$\operatorname{Var}(r' \mid \{z_k\}) = 4\sum_{k=1}^{M}(x_k^2 + y_k^2)\,|\phi_0''| = 4r\,|\phi_0''|$$

as before. The only change in our result is to replace $p(r)$ in (7-26) by the gamma
density function for $M$ degrees of freedom,

$$p(r) = \frac{r^{M-1}}{(M-1)!\,(2\phi_0)^M}\exp\left(-\frac{r}{2\phi_0}\right)U(r), \tag{7-29}$$

whereupon the false-alarm rate is

$$\eta = \frac{\sqrt{4\pi}\,\beta}{(M-1)!}\left(\frac{a}{2\phi_0}\right)^{M-1/2}\exp\left(-\frac{a}{2\phi_0}\right), \tag{7-30}$$

and the false-alarm probability is approximately $Q_0 = \eta T$.



7.4 UNKNOWN ARRIVAL TIME: 

THE PROBABILITY OF DETECTION 

The probability of detection in a maximum-likelihood receiver, expressed in (7-13),
is even more difficult to calculate precisely than the false-alarm probability, for the
random process $r(t)$ is no longer stationary; the signal makes its statistical properties,
such as its expected value and variance, into functions of time. At large
energy-to-noise ratios, however, as we shall now argue, it is a good approximation
to set the probability $Q_d$ of detection equal to what it would be if the arrival time $\tau$
of the signal were known. In effect we equate the probability that the peak value of
the process $r(t)$ exceeds the decision level $r = a$ with the probability that a sample
of $r(t)$ taken at the proper instant for detecting a signal of known arrival time will
exceed the same level.

The signal component of the rectified, delayed, and summed outputs, that is,
of $r(t)$, peaks at the time $t_0 = \tau_0 + T'$, where $\tau_0$ is the exact arrival time of the
signal and $T'$ the delay in the matched filter. When the energy-to-noise ratio is
large, this time is close to the time $t_m = \hat{\tau} + T'$ when the sum of signal and noise
is maximum. The difference $t_m - t_0$ is the error in estimating the arrival time of
the radar echo, and as we have seen in Sec. 6.3, its rms value is much less than the
reciprocal bandwidth $\Delta\omega^{-1}$ of the signal when the energy-to-noise ratio $S = E/N$
is large. Because of the correlation imposed on the noise by the matched filter, the
sum of signal and noise can change only slightly between times $t_0$ and $t_m$. Then the
probability

$$\Pr[\max r(t) > a,\ T' < t < T + T' \mid H_1] = \Pr[r(t_m) > a \mid H_1]$$

that the peak of the output $r(t)$ exceeds the decision level $a$ when $H_1$ is true must,
for $S \gg 1$, be nearly the same as the probability

$$\Pr[r(t_0) > a \mid H_1].$$

This, however, is just the probability of detecting the signal when its arrival time $\tau_0$
is known, as calculated in Sec. 4.2.2. The detection probability is therefore to good
approximation

$$Q_d \approx Q_M(S, y) = \int_y^\infty\left(\frac{r}{S}\right)^{(M-1)/2} e^{-r-S}\,I_{M-1}(2\sqrt{rS})\,dr, \qquad y = \frac{a}{2\phi_0}, \tag{7-31}$$

$M$ being the number of delayed and rectified outputs of the matched filter that
make up the final output $r(t)$. [Here we use the form of the generalized Marcum
$Q$-function in (C-19) for simplicity of writing.]
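An integral of the form in (7-31) can be evaluated without numerical quadrature. Expanding the Bessel function $I_{M-1}$ in its power series turns the integrand into a Poisson mixture of gamma densities, so that $Q_M(S, y) = \sum_{k \ge 0} e^{-S}(S^k/k!)\,e^{-y}\sum_{j < M+k} y^j/j!$. The sketch below implements that series; the function name, tolerance, and term limit are assumptions made here for illustration, and the direct use of $e^{-S}$ limits it to values of $S$ up to a few hundred.

```python
import math

def generalized_marcum_q(M, S, y, tol=1e-12, max_terms=10000):
    """Q_M(S, y) in the form of (7-31), summed as a Poisson mixture of gamma tails."""
    # Gamma tail for M degrees of freedom: exp(-y) * sum_{j<M} y^j / j!
    term = math.exp(-y)
    tail = term
    for j in range(1, M):
        term *= y / j
        tail += term
    weight = math.exp(-S)  # Poisson weight for k = 0
    total = weight * tail
    accumulated = weight
    k = 0
    # The neglected terms are bounded by the remaining Poisson mass.
    while 1.0 - accumulated > tol and k < max_terms:
        k += 1
        weight *= S / k
        term *= y / (M + k - 1)  # next term of the gamma tail
        tail += term
        total += weight * tail
        accumulated += weight
    return total

if __name__ == "__main__":
    # Arbitrary illustrative values: S = 20 (about 13 dB), y = 10, M = 1.
    print(generalized_marcum_q(1, 20.0, 10.0))
```

With $S = 0$ the series reduces to the gamma tail, which for $M = 1$ is simply $e^{-y}$.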
The false-alarm probability is

$$Q_0 = \frac{\sqrt{4\pi}}{(M-1)!}\,y^{M-1/2}\,(\beta T)\,e^{-y}, \qquad y = \frac{a}{2\phi_0}, \tag{7-32}$$

from (7-30). For a signal bandlimited to $W$ hertz, as we saw in Sec. 7.3, $\beta = W/\sqrt{12}$,
and we find

$$Q_0 = \frac{1}{(M-1)!}\sqrt{\frac{\pi}{3}}\,y^{M-1/2}\,L\,e^{-y}, \qquad L = WT.$$
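To set the decision level for a prescribed false-alarm probability, the bandlimited form of $Q_0$ must be solved for $y$. The following sketch does so by bisection on the logarithm of $Q_0$; the function names are hypothetical, and the search is confined to $y > M - \tfrac{1}{2}$, where $y^{M-1/2}e^{-y}$ is decreasing. It assumes, as the text does, that $Q_0 \ll 1$.

```python
import math

def log_false_alarm(y, M, L):
    """log Q_0 for a bandlimited signal: Q_0 = sqrt(pi/3) y^(M-1/2) L e^(-y) / (M-1)!."""
    return (0.5 * math.log(math.pi / 3.0) - math.lgamma(M)
            + (M - 0.5) * math.log(y) + math.log(L) - y)

def threshold_for_false_alarm(q0, M, L):
    """Solve Q_0(y) = q0 for the normalized decision level y = a / (2 phi_0)."""
    target = math.log(q0)
    lo = M - 0.5  # y^(M-1/2) e^(-y) peaks here and decreases beyond it
    if log_false_alarm(lo, M, L) < target:
        raise ValueError("q0 is too large for the crossing-rate approximation")
    hi = lo + 1.0
    while log_false_alarm(hi, M, L) > target:
        hi *= 2.0
    for _ in range(200):  # bisection on the decreasing branch
        mid = 0.5 * (lo + hi)
        if log_false_alarm(mid, M, L) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Arbitrary illustrative values: Q_0 = 1e-6, M = 1, L = WT from 1e2 to 1e5.
    for L in (1e2, 1e3, 1e4, 1e5):
        print(L, threshold_for_false_alarm(1e-6, 1, L))
```

Because $L$ enters only through $\log L$, the decision level grows slowly as the observation interval lengthens.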

In Fig. 7-3 we have plotted versus $L = WT$ the energy-to-noise ratio $S = E/N$
(dB) required to yield a detection probability $Q_d = 0.999$ for such a bandlimited signal
with $M = 1$ and the same false-alarm probabilities as in Fig. 7-1. The maximum-likelihood
receiver is seen to be much less sensitive to uncertainty in the arrival time
$\tau$ than the threshold receiver of Sec. 7.1. Lengthening the observation interval by
a factor of 1000 requires an increase in the energy-to-noise ratio $S$ by only 0.7 to
1.3 dB in order to maintain the same reliability $(Q_0, Q_d)$. The factor $\beta T$ affects
the decision level through (7-32) only logarithmically, and the resulting slow rise
in the decision level $a$ with increasing $\beta T$ entails only a slow rise in the requisite
energy-to-noise ratio.



7.5 SIGNALS OF UNKNOWN ARRIVAL TIME AND
CARRIER FREQUENCY 



If a radar target is moving, the carrier frequency of the echo is displaced from that
of the transmitted radar pulse by the Doppler shift. As shown in Appendix F,
if the target is moving away from the transmitter and receiver with a velocity $v$,
the carrier frequency is altered upon reflection by $w = -(2v/c)\,\Omega_0$, where $c$ is the
velocity of light and $\Omega_0$ the carrier frequency of the transmitted pulse; $|v| \ll c$. The
complex envelope of the signal is slightly expanded or contracted, depending on the
direction of relative motion, but by a negligible amount when as usual $|v| \ll c$. The
receiver must anticipate signals having a carrier frequency $\Omega$ lying anywhere in a
band $\Omega_0 - \tfrac{1}{2}W_d < \Omega < \Omega_0 + \tfrac{1}{2}W_d$, where $\tfrac{1}{2}W_d$ equals $(2\Omega_0/c)$ times the maximum
velocity attainable by a target.

Figure 7-3. Energy-to-noise ratio (dB) for $Q_d = 0.999$, maximum-likelihood
detector, versus time-bandwidth product. Curves are indexed with the false-alarm
probability $Q_0$.

A typical echo signal will have the form

$$s(t; A, \psi, \tau, w) = A\operatorname{Re} F(t - \tau)\exp[i(\Omega_0 + w)t + i\psi],$$

where $\tau$ is the arrival time of the envelope, $w$ the Doppler shift, $A$ the amplitude, and
$\psi$ the phase. The likelihood functional for detecting this signal in white Gaussian
noise is

$$\Lambda[v(t); A, \psi, \tau, w] = \exp\left\{\frac{A}{N}\operatorname{Re}\left[e^{-i\psi}\int_0^T F^*(t - \tau)\,e^{-iwt}\,V(t)\,dt\right] - \frac{A^2}{2N}\int_0^T|F(t - \tau)|^2\,dt\right\}$$

as in (6-90), where $V(t)$ is the complex envelope of the input

$$v(t) = \operatorname{Re} V(t)\exp i\Omega_0 t,$$

referred to the carrier frequency $\Omega_0$ of the transmitter. Again the phase $\psi$ can be
taken as uniformly distributed over $(0, 2\pi)$, and we assume that the observation
interval $(0, T)$ is much longer than the duration $T'$ of the signal.

In the ensemble of signals the arrival time $\tau$ and the Doppler shift $w$ can be
taken as statistically independent, and if their prior probability density functions are
$z_1(\tau)$ and $z_2(w)$, the average likelihood functional for detecting a signal of amplitude
$A$ will be

$$\Lambda[v(t)] = \int_0^T z_1(\tau)\,d\tau\int_{-W_d/2}^{W_d/2} z_2(w)\,dw\,\exp\left[-\frac{A^2}{2N}\int_0^T|F(t - \tau)|^2\,dt\right] I_0\left(\frac{A}{N}\left|\int_0^T F^*(t - \tau)\,e^{-iwt}\,V(t)\,dt\right|\right),$$

much as in (7-2). Again we can put

$$\int_0^T|F(t - \tau)|^2\,dt \approx \int_{-\infty}^{\infty}|F(t)|^2\,dt = 1,$$

and again we can take the prior density function $z_1(\tau)$ of the arrival time as the
uniform one in (7-1).

If the input $v(t)$ is passed through a filter matched to a signal with Doppler
shift $w$, whose complex impulse response is

$$K(s; w) = \begin{cases} F^*(T' - s)\,e^{-iw(T' - s)}, & 0 < s < T', \\ 0, & s < 0,\ s > T', \end{cases}$$

as in (6-93), the complex envelope $V_0(t; w)$ of the output will be

$$V_0(t; w) = \int_0^{T'} F^*(u)\,e^{-iwu}\,V(t - T' + u)\,du \tag{7-33}$$

as in (3-16), and we can write the average likelihood functional as

$$\Lambda[v(t)] = \frac{e^{-A^2/2N}}{T}\int_{T'}^{T+T'} dt\int_{-W_d/2}^{W_d/2} z_2(w)\,I_0\left[\frac{A}{N}\,|V_0(t; w)|\right]dw. \tag{7-34}$$

To implement this functional, even after picking some standard amplitude $A$ and
making a reasonable assumption about the prior density function $z_2(w)$, would be
most difficult, not to speak of trying to evaluate the performance of the resulting
receiver.

Because the modified Bessel function $I_0(\cdot)$ is a steeply rising function of its argument,
the main contribution to the integral in (7-34) comes from the neighborhood
of the maximum value of $|V_0(t; w)|$, provided, as we assume, that the prior density
function $z_2(w)$ varies smoothly over the frequency interval $(-\tfrac{1}{2}W_d < w < \tfrac{1}{2}W_d)$ and
$W_d T' \gg 1$. If that maximum value $|V_0(t_m; w_m)|$ is large, $\Lambda[v(t)]$ will be approximately
proportional to

$$I_0\left[\frac{A}{N}\,|V_0(t_m; w_m)|\right],$$

and we can expect $\Lambda[v(t)]$ to be large when $|V_0(t_m; w_m)|$ is large. A good approximation
to the optimum receiver will again be the maximum-likelihood receiver, which
decides that a signal is present (hypothesis $H_1$) if the maximum value of the function
$|V_0(t; w)|^2$ exceeds a decision level $a$, that is, if $|V_0(t; w)|^2$ crosses $a$ at some time $t$
in $(T', T + T')$ and for some value or values of $w$.

It is impossible to examine $|V_0(t; w)|^2$ for a continuum of values of $w$ in $(-\tfrac{1}{2}W_d,
\tfrac{1}{2}W_d)$. Instead, just as in Sec. 6.3.2, one constructs a bank of narrowband filters in
parallel, each matched to a signal

$$\operatorname{Re} F(t)\exp i(\Omega_0 + w)t, \qquad 0 < t < T',$$

for one of a finite set of values of $w$ uniformly spaced over the range $(-\tfrac{1}{2}W_d, \tfrac{1}{2}W_d)$,
and one observes all their rectified outputs during the interval $(T', T + T')$. By
taking their spacing $\delta w$ in frequency small enough so that their complex transfer
functions $f^*(\omega - w)$ substantially overlap, one loses little of the information contained
in $|V_0(t; w)|$ as a continuous function of $w$. [Here again $f(\omega)$ is the spectrum
of the complex envelope of the signal $F(t)$.]

In Secs. 6.3.2 and 6.3.3, we examined the outputs of just such a bank of
matched filters. There we showed that the signal component of the quadratically
rectified output of the filter $K_w(s)$ matched to a signal with Doppler shift $w$ is

$$R_s(t; w) = A^2\,|\lambda(t - t_0, w - w_0)|^2, \qquad t_0 = \tau_0 + T',$$

where $\lambda(\tau, w)$ is the complex ambiguity function defined in (6-96); $\lambda(0, 0) = 1$. This
output is maximum in the filter tuned for $w = w_0$ and at time $t_0 = \tau_0 + T'$. The
noise will displace the time at which the peak output occurs and may cause that
largest output to take place in a neighboring filter.

The noise components of the filter outputs are, however, correlated. As in
(6-108), the complex cross-covariance function of the complex envelopes of the outputs
of filters tuned for frequency shifts $w_1$ and $w_2$ is

$$\phi(\tau; w_1, w_2) = \tfrac{1}{2}E[V_0(t; w_1)\,V_0^*(t + \tau; w_2) \mid H_0] = N\exp[-\tfrac{1}{2}i(w_1 + w_2)\tau]\,\lambda(\tau, w_1 - w_2).$$

The width of this function in the frequency direction is on the order of $\Delta t^{-1}$, where
again $\Delta t^2$ is the mean-square duration of the signal. The rectified outputs of filters
having pass frequencies separated by $\delta w \ll \Delta t^{-1}$ will therefore be significantly
correlated, and the signal components of those outputs will differ only by a little.

If the rectified output of any of these filters exceeds a decision level $a$, the
receiver decides that a signal is present. The frequency shift $w_m$ associated with the
filter having the largest rectified output serves as an estimate of the Doppler shift $w$
of the echo and hence provides an estimate of the component $v$ of the velocity of the
target in the direction of the receiver. The time $t_m$ at which the peak output occurs
furnishes as before an estimate of the distance of the target through $\hat{\tau} = t_m - T'$.

We can derive a crude approximation to the false-alarm probability by the
following reasoning. The factor $\beta T$ in (7-32) can be thought of as the number of
independent opportunities for the rectified output $r(t)$ of the matched filter to cross
the decision level $a$ when under hypothesis $H_0$ it consists of only noise. Looking
across the bank of matched filters in the frequency direction, we see outputs that are
approximately uncorrelated when separated in angular frequency by roughly $\Delta t^{-1}$.
The total number of approximately independent outputs in the range of frequencies
spanned by the filter bank is about $W_d\,\Delta t$. The overall false-alarm rate can therefore
be approximated by $W_d\,\Delta t$ times that determined in Sec. 7.3, and for a single input,
$M = 1$, by (7-32),

$$Q_0 \approx \sqrt{4\pi y}\,(\beta T)(W_d\,\Delta t)\,e^{-y}, \qquad y = \frac{a}{2\phi_0}, \tag{7-35}$$

provided $W_d\,\Delta t \gg 1$. Here as in Sec. 7.4 $y$ is proportional to the decision level $a$.
If $W_d\,\Delta t < 1$, the outputs of all the filters will fluctuate more or less together, and
in (7-35) we must replace the factor $W_d\,\Delta t$ by 1.
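The counting argument behind (7-35) amounts to multiplying the single-filter false-alarm probability by the number of effectively independent Doppler cells. A minimal sketch of that rule follows; the function and argument names are hypothetical, and it applies only to $M = 1$ and small $Q_0$.

```python
import math

def doppler_bank_false_alarm(y, beta_T, wd_dt):
    """Approximate Q_0 for a Doppler filter bank, M = 1, following (7-35).

    y      : normalized decision level
    beta_T : number of independent opportunities in time (beta * T)
    wd_dt  : W_d * Delta t, the number of independent Doppler cells
    """
    cells = max(wd_dt, 1.0)  # below one cell the filter outputs fluctuate together
    return math.sqrt(4.0 * math.pi * y) * beta_T * cells * math.exp(-y)

if __name__ == "__main__":
    # Arbitrary illustrative values.
    print(doppler_bank_false_alarm(25.0, 100.0, 0.5))
    print(doppler_bank_false_alarm(25.0, 100.0, 50.0))
```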

By an argument similar to that in Sec. 7.4, the probability $Q_d$ of detecting an
echo signal with large enough input energy-to-noise ratio $S$ will again be roughly
the same as though its arrival time and Doppler shift were known a priori,

in terms of Marcum's $Q$ function (3-76). The difference $w_m - w_0$ represents the error
in an estimate of the carrier frequency, and we saw in Sec. 6.3.4 that the rms error
will be a fraction of $\Delta t^{-1}$ determined by the signal-to-noise ratio $d$, and $d \gg 1$. The
difference $t_m - t_0$ represents the error in an estimate of the arrival time, and this will
be a fraction of $\Delta\omega^{-1}$. If the largest peak output $|V_0(t_m; w_m)|^2$ exceeds the decision
level, therefore, it is nearly certain that the output $|V_0(t_0; w_0)|^2$ will also exceed
that decision level. The latter, however, is the decision statistic we should use if we
knew the true arrival time and Doppler shift of the signal.

If a number $M$ of successive inputs to the receiver are filtered, rectified, delayed,
and summed as in Fig. 7-2, the delay line associated with a given filter must
introduce a delay that is compensated for the motion of the target between transmitted
pulses. The delay corresponding to a target receding with velocity $v$ must
equal $(1 + 2v/c)T = (1 - w/\Omega_0)T$, where $w$ is the resulting Doppler shift. The
probability of detection can then as before be approximated by (7-31).

The additional factor $W_d\,\Delta t$ affects the decision level $a$ only logarithmically,
and as with $\beta T$, the energy-to-noise ratio required to attain a particular probability
$Q_d$ of detection will increase only slowly with an increasing width $W_d$ of the
frequency band within which the carrier frequency $\Omega_0 + w$ of the radar echo is
uncertain.

When in Chapter 10 we treat the resolution of signals, we shall study the 
properties of the ambiguity function $\lambda(\tau, w)$ in detail, and we shall consider further
how the detection of signals of unknown arrival time and Doppler shift and the 
estimation of those parameters depend on its form. 

7.6 THE LEAST FAVORABLE DISTRIBUTION 

7.6.1 The Extended Neyman-Pearson Criterion 

In this chapter we have thus far assumed that the unknown arrival time $\tau$ and
the unknown frequency shift $w$ of the radar echo $s(t; \tau, w)$ are a priori uniformly
distributed over their respective ranges $(0, T)$ and $(-\tfrac{1}{2}W_d, \tfrac{1}{2}W_d)$. Ordinarily there is
no reason to adopt any other prior distributions for such parameters. If the noise
were not white, however, but possessed some nonuniform spectral density, or if the
noise level varied in some known way during the observation interval, it might be
unclear what kind of prior density functions $z_1(\tau)$, $z_2(w)$ would be most reasonable,
absent any reliable past experience of the relative frequencies with which various
values of the signal parameters $\tau$ and $w$ occur.

In order to treat this problem in a general way, we return to the consideration
of the extended Neyman-Pearson criterion that we began in Sec. 3.6.2. Let the
signal $s(t; \boldsymbol{\theta})$ depend on $m$ unknown parameters $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_m)$, which lie in a
parameter space $\Theta$. As before, we suppose the input $v(t)$ to the receiver to have been
appropriately sampled to produce $n$ data $\mathbf{v} = (v_1, \ldots, v_n)$ upon which the decision
about the presence or the absence of the signal will be based. Under hypothesis $H_0$
(noise alone is present) these are governed by a joint probability density function
$p_0(\mathbf{v})$. When under hypothesis $H_1$ the signal $s(t; \boldsymbol{\theta})$ is present, their joint probability
density function is $p_1(\mathbf{v} \mid \boldsymbol{\theta})$.

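Given the prior probability density function $z(\boldsymbol{\theta})$ of the unknown parameters,

$$\int_\Theta z(\boldsymbol{\theta})\,d^m\boldsymbol{\theta} = 1, \tag{7-36}$$

the extended Neyman-Pearson criterion calls for basing the decision between hypotheses
$H_0$ and $H_1$ on the average likelihood ratio

$$\Lambda(\mathbf{v}; z) = \frac{p_1(\mathbf{v})}{p_0(\mathbf{v})},$$

where

$$p_1(\mathbf{v}) = \int_\Theta z(\boldsymbol{\theta})\,p_1(\mathbf{v} \mid \boldsymbol{\theta})\,d^m\boldsymbol{\theta} \tag{7-37}$$

is the overall probability density function of the data under hypothesis $H_1$. If $\Lambda(\mathbf{v}; z)$
exceeds a certain decision level $\Lambda_0$, hypothesis $H_1$ is selected. The decision surface
$D[z]$ on which $\Lambda(\mathbf{v}; z) = \Lambda_0$ divides the space $R_n$ of the data into two regions $R_0[z]$
and $R_1[z]$, and

$$\Lambda(\mathbf{v}; z) < \Lambda_0,\ \mathbf{v} \in R_0[z]; \qquad \Lambda(\mathbf{v}; z) > \Lambda_0,\ \mathbf{v} \in R_1[z].$$

The value of the decision level $\Lambda_0$ is chosen to induce a preassigned value of the
false-alarm probability

$$Q_0 = \int_{R_1[z]} p_0(\mathbf{v})\,d^n\mathbf{v}. \tag{7-38}$$

The probability of correctly detecting the signal $s(t; \boldsymbol{\theta})$ with this receiver is

$$Q_d(\boldsymbol{\theta}; z) = \int_{R_1[z]} p_1(\mathbf{v} \mid \boldsymbol{\theta})\,d^n\mathbf{v}, \tag{7-39}$$

and the average probability of detection is

$$Q_d[z] = \int_\Theta z(\boldsymbol{\theta})\,Q_d(\boldsymbol{\theta}; z)\,d^m\boldsymbol{\theta}. \tag{7-40}$$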

Under the extended Neyman-Pearson criterion, as we saw in Sec. 3.6.2, this average
detection probability is maximum among all decision strategies satisfying (7-38).

When the prior probability density function $z(\boldsymbol{\theta})$ is unknown, the most conservative
course would seem to be to take it as that density function $\tilde{z}(\boldsymbol{\theta})$ for which
$Q_d[z]$ is smallest. As we shall see, with whatever set $\boldsymbol{\theta}$ of parameters the signal arrives,
the probability $Q_d(\boldsymbol{\theta}; \tilde{z})$ of detecting it with the receiver designed for the least
favorable distribution $\tilde{z}(\boldsymbol{\theta})$ will never be less than the average detection probability
$Q_d[\tilde{z}]$, and a fortiori for any actual prior distribution $z(\boldsymbol{\theta})$ of the parameters

$$\int_\Theta z(\boldsymbol{\theta})\,Q_d(\boldsymbol{\theta}; \tilde{z})\,d^m\boldsymbol{\theta} \ge Q_d[\tilde{z}].$$

If, on the other hand, the receiver were designed for some other prior probability
density function $z_1(\boldsymbol{\theta})$, and the true prior probability density function $z(\boldsymbol{\theta})$ differed
from it, the average detection probability might be less than $Q_d[\tilde{z}]$. We now set out
to determine criteria by which the least favorable prior probability density function
$\tilde{z}(\boldsymbol{\theta})$ can be identified.

When, as often, the amplitude $A > 0$ of the signal, perhaps among other
parameters, is unknown a priori, its least favorable distribution $\tilde{z}(A)$ will obviously be
$\delta(A)$; the probability $Q_d(A)$ of detecting a signal of amplitude $A$ is always least when
$A = 0$. The amplitude $A$ thus plays a peculiar role among the signal parameters, and
it must be treated separately. In what follows we assume that either the amplitude $A$
of the signal is known, or the receiver is designed to be optimum only for signals of
some standard amplitude, or the probability density function $p_1(\mathbf{v} \mid A, \boldsymbol{\theta})$ has already
been averaged with respect to some prior density function $z(A)$ of the amplitude.
We turn to the search for a criterion by which the least favorable distribution $\tilde{z}(\boldsymbol{\theta})$
of parameters other than $A$ can be recognized.

For simplicity let us assume at first that the parameter space contains only
$M$ discrete points $\boldsymbol{\theta}_1, \boldsymbol{\theta}_2, \ldots, \boldsymbol{\theta}_M$. In detection, for instance, it may be that under
hypothesis $H_1$ one and only one of $M$ possible signals $s_j(t) = s(t; \boldsymbol{\theta}_j)$ can be present
in a given trial; the observer is not concerned which it is. Let the prior probabilities
of these parameter values or signals be $z_1, z_2, \ldots, z_M$, with

$$z_1 + z_2 + \cdots + z_M = 1. \tag{7-41}$$

The set of probabilities $\mathbf{z} = (z_1, z_2, \ldots, z_M)$ specifies the relative frequencies of the
various parameter values $\boldsymbol{\theta}_j$ when hypothesis $H_1$ is true, that is, when some signal
is present. These probabilities can be represented as a point in an $M$-dimensional
Cartesian space, and (7-41) requires the point to lie on a particular hyperplane in
that space.

The joint probability density function of the observations $\mathbf{v}$ when the $j$th signal
is present is abbreviated as

$$p_1(\mathbf{v} \mid \boldsymbol{\theta}_j) = p_j(\mathbf{v});$$

the joint density function of the data $\mathbf{v}$ under hypothesis $H_0$ is still $p_0(\mathbf{v})$. The average
probability of detection is then

$$Q_d[\mathbf{z}] = \int_{R_1[\mathbf{z}]}\sum_{j=1}^{M} z_j\,p_j(\mathbf{v})\,d^n\mathbf{v} = \sum_{j=1}^{M} z_j\,Q_j[\mathbf{z}], \tag{7-42}$$

where

$$Q_j[\mathbf{z}] = \int_{R_1[\mathbf{z}]} p_j(\mathbf{v})\,d^n\mathbf{v}$$

is the probability of detecting the $j$th signal with the Neyman-Pearson strategy for
prior distribution $\mathbf{z}$. Here $R_1[\mathbf{z}]$ is again the region of the space $R_n$ of observations
in which $H_1$ is chosen. The Neyman-Pearson criterion directs us to maximize the
average probability $Q_d[\mathbf{z}]$ for a fixed false-alarm probability given by (7-38). The
regions $R_0[\mathbf{z}]$ and $R_1[\mathbf{z}]$ are separated by the decision surface $D[\mathbf{z}]$, whose position
we must determine. In particular, we seek a set of prior probabilities $\mathbf{z}$ and a decision
surface $D[\mathbf{z}]$ such that the maximum value of $Q_d[\mathbf{z}]$ is as small as possible.
The prior probabilities $\mathbf{z}$ are restricted to the range (0, 1),

$$0 \le z_j \le 1, \qquad j = 1, 2, \ldots, M; \tag{7-43}$$

the point $\mathbf{z}$ lies in the unit hypercube of the $M$-dimensional space, a requirement
that further constrains the variational problem at hand. With each point $\mathbf{z}$ in the
intersection of this unit hypercube with the hyperplane of (7-41), there is associated
a maximum value of $\bar Q_d[\mathbf{z}]$ for a given value of $Q_0$. That maximum value is attained
by a receiver that forms the average likelihood ratio

$$\Lambda(\mathbf{v}) = \sum_{j=1}^{M} z_j \Lambda_j(\mathbf{v}), \tag{7-44}$$

where

$$\Lambda_j(\mathbf{v}) = \frac{p_j(\mathbf{v})}{p_0(\mathbf{v})}, \qquad 1 \le j \le M,$$

and compares it with a decision level $\Lambda_0$, choosing hypothesis $H_1$ when $\Lambda(\mathbf{v}) > \Lambda_0$
and hypothesis $H_0$ otherwise. The decision level $\Lambda_0$ is set by the requirement that
the false-alarm probability equal the preassigned value. This strategy is the same as
that described in the beginning, except that we have only a finite number instead of
a continuum of possible signals.
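
The receiver of (7-44) can be sketched in a few lines for sampled data. The sketch below is only illustrative: it assumes that the data are $n$ samples in independent Gaussian noise of equal variance, so that each $\Lambda_j(\mathbf{v})$ has the familiar exponential form; the function names and the sample values are hypothetical.

```python
import math

def likelihood_ratio(v, s, noise_var):
    """p_j(v)/p_0(v) for a known signal vector s in independent Gaussian
    noise samples of variance noise_var (an assumed noise model)."""
    correlation = sum(vi * si for vi, si in zip(v, s))
    energy = sum(si * si for si in s)
    return math.exp((correlation - 0.5 * energy) / noise_var)

def average_likelihood_ratio(v, signals, z, noise_var):
    """Weighted sum of (7-44) for the prior probabilities z."""
    if any(zj < 0.0 for zj in z) or abs(sum(z) - 1.0) > 1e-9:
        raise ValueError("z must be nonnegative and sum to 1, as in (7-41) and (7-43)")
    return sum(zj * likelihood_ratio(v, s, noise_var)
               for zj, s in zip(z, signals))

def choose_h1(v, signals, z, noise_var, level):
    """True when the average likelihood ratio exceeds the decision level."""
    return average_likelihood_ratio(v, signals, z, noise_var) > level

if __name__ == "__main__":
    signals = [[1.0, 0.0], [0.0, 2.0]]   # hypothetical signal vectors
    z = [0.5, 0.5]                        # hypothetical prior probabilities
    print(average_likelihood_ratio([0.3, 1.1], signals, z, noise_var=1.0))
```

The decision level passed to `choose_h1` is left as an input; in practice it would be set from the preassigned false-alarm probability, which this sketch does not attempt to compute.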

Henceforth we define $\bar Q_d[\mathbf{z}]$ as the average probability of detection when this
strategy, optimum under the extended Neyman-Pearson criterion, has been adopted
by the receiver. We shall now demonstrate that $\bar Q_d[\mathbf{z}]$ is a convex ∪ function of the
set $\mathbf{z} = (z_1, z_2, \ldots, z_M)$ as constrained by (7-41). That is, if $\mathbf{z}_A$ and $\mathbf{z}_B$ are two such
distributions of prior probability among the $M$ signals and if $\mu_A$ and $\mu_B$ are two
constants in (0, 1), with $\mu_A + \mu_B = 1$, then

$$\bar Q_d[\mathbf{z}] = \bar Q_d[\mu_A \mathbf{z}_A + \mu_B \mathbf{z}_B] \le \mu_A \bar Q_d[\mathbf{z}_A] + \mu_B \bar Q_d[\mathbf{z}_B]. \tag{7-45}$$

From this convexity it follows that a least favorable distribution exists.

Consider two scenarios $S_A$ and $S_B$. Under $S_A$ the transmitter in a binary
communication system sends the signals $s_j(t) = s(t; \boldsymbol{\theta}_j)$, when it sends any signal at
all, with the relative frequencies

$$\mathbf{z}_A = \left(z_1^{(A)}, z_2^{(A)}, \ldots, z_M^{(A)}\right);$$

under $S_B$ it sends them with relative frequencies $\mathbf{z}_B$. It uses scenario $S_A$ with relative
frequency $\mu_A$ and $S_B$ with relative frequency $\mu_B = 1 - \mu_A$. The actual set of relative
frequencies of the signals under hypothesis $H_1$ is then $\mu_A \mathbf{z}_A + \mu_B \mathbf{z}_B = \mathbf{z}$. The
transmitter adopts one scenario or the other whether the current message symbol is
a 0 or a 1; if it is a 0, of course, no signal is transmitted.

The transmitter informs the receiver which scenario it is using in any given
symbol transmission by sending a symbol $u = 0$ or $u = 1$ over a noiseless channel,
$u = 0$ indicating it is using scenario $S_A$ and $u = 1$ indicating it is using $S_B$. The
receiver bases its decisions on the data $v_1, v_2, \ldots, v_n$ and on the variable $u$. If $u = 0$,
it forms the average likelihood ratio

$$\Lambda_A(\mathbf{v}) = \sum_{j=1}^{M} z_j^{(A)} \Lambda_j(\mathbf{v}).$$

If $u = 1$, it forms the average likelihood ratio

$$\Lambda_B(\mathbf{v}) = \sum_{j=1}^{M} z_j^{(B)} \Lambda_j(\mathbf{v}).$$

These are compared with decision levels $\Lambda_{A0}$ and $\Lambda_{B0}$, respectively, each chosen to
yield the preassigned false-alarm probability $Q_0$. The average probability of correct
decision under hypothesis $H_1$ is then

$$\bar Q_d = \mu_A \bar Q_d[\mathbf{z}_A] + \mu_B \bar Q_d[\mathbf{z}_B], \tag{7-46}$$

and the false-alarm probability is $\mu_A Q_0 + \mu_B Q_0 = Q_0$.

An alternative strategy for the receiver is to disregard the datum $u$. The prior
probability of the $j$th signal under hypothesis $H_1$ is then equal to

$$z_j = \mu_A z_j^{(A)} + \mu_B z_j^{(B)},$$

and the receiver must form the average likelihood ratio

$$\Lambda(\mathbf{v}) = \sum_{j=1}^{M} z_j \Lambda_j(\mathbf{v}),$$

comparing it with that decision level $\Lambda_0$ that yields the false-alarm probability
$Q_0$. This alternative strategy attains an average detection probability $\bar Q_d[\mathbf{z}]$, with
$\mathbf{z} = \mu_A \mathbf{z}_A + \mu_B \mathbf{z}_B$. This average detection probability cannot, however, exceed the
detection probability in (7-46), for in ignoring the datum $u$ the alternative strategy
has discarded part of the available information on which the decisions might be
based. The inequality in (7-45) must therefore hold, and the average detection
probability $\bar Q_d[\mathbf{z}]$ must be a convex ∪ function in the space of the distributions $\mathbf{z}$ of prior
probability constrained by (7-41).

Because the average detection probability $\bar Q_d[\mathbf{z}]$ is convex ∪ in $\mathbf{z}$, it must possess
a unique minimum $\tilde Q_d = \bar Q_d[\tilde{\mathbf{z}}]$ for some prior distribution $\tilde{\mathbf{z}} = (\tilde z_1, \ldots, \tilde z_M)$ within
or on the boundary of the admissible region

$$0 \le z_j \le 1, \qquad 1 \le j \le M, \qquad \sum_{j=1}^{M} z_j = 1.$$

We call $\tilde{\mathbf{z}}$ the least favorable distribution of the prior probabilities of the several
signals $s_j(t)$. There may be a number of prior probability distributions $\tilde{\mathbf{z}}$ yielding the
same minimum average detection probability $\bar Q_d[\tilde{\mathbf{z}}]$; if so, they form a convex set in
the sense that any weighted average of them will attain the same average detection
probability. This would be the case, for instance, if certain of the parameters $\boldsymbol{\theta}$ were
irrelevant. We disregard this possibility henceforth.

If a uniformly most powerful test of hypothesis $H_1$ against hypothesis $H_0$ exists
and if it is characterized by the probabilities $Q_j = Q_d(\boldsymbol{\theta}_j)$ of detecting the signals,
$1 \le j \le M$, the least favorable distribution will assign probability 1 to the signal for
which $Q_j$ is least. (If several signals share a common minimum detection probability,
the prior probabilities $\tilde z_j$ can be distributed among these signals arbitrarily.) Such is
the case, as we have seen, when the signals have the same form, but different positive
amplitudes; the weakest signal will be assigned prior probability 1. Henceforth we
exclude the existence of a uniformly most powerful test.

When the minimum $\tilde Q_d$ occurs at a point $\tilde{\mathbf{z}}$ lying entirely within the admissible
region, we can find $\tilde{\mathbf{z}}$ by the technique of Lagrange multipliers. We are seeking a
stationary value of (7-42) under the constraints of (7-38), (7-41), and (7-43). We
therefore try to find a value of the quantity

$$\Gamma = \sum_{j=1}^{M} z_j \int_{R_1[\mathbf{z}]} p_j(\mathbf{v})\, d^n\mathbf{v} - \lambda \int_{R_1[\mathbf{z}]} p_0(\mathbf{v})\, d^n\mathbf{v} - \mu \sum_{j=1}^{M} z_j \tag{7-47}$$

that is maximum for a variation in the decision surface $D[\mathbf{z}]$ and minimum for a
variation in the components $z_j$ of $\mathbf{z}$. These variations can now be made independently.
The position of the surface $D[\mathbf{z}]$ and the values of the $z_j$'s for which the stationary
value is attained are functions of the Lagrange multipliers $\lambda$ and $\mu$, whose values
are later chosen so that the constraints in (7-38) and (7-41) are satisfied.

First taking the $z_j$'s as fixed, we vary the surface $D[\mathbf{z}]$ until the quantity $\Gamma$ is
maximum. As in our analysis in Sec. 1.2, the maximum is attained when the decision
surface $D[\mathbf{z}]$ is one of a family of surfaces described by the equation

$$\Lambda(\mathbf{v}) = \sum_{j=1}^{M} z_j \frac{p_j(\mathbf{v})}{p_0(\mathbf{v})} = \lambda, \qquad \mathbf{v} \in D[\mathbf{z}], \qquad 0 < \lambda < \infty. \tag{7-48}$$

When the decision surface $D[\mathbf{z}]$ is given by (7-48), any small variation in it will
produce a decrease in the quantity $\Gamma$ that is of second order in the magnitude of the
change in the position of the surface, as is usually the case with stationary values.

Assuming now that the regions $R_0[\mathbf{z}]$ and $R_1[\mathbf{z}]$ are separated by a surface $D[\mathbf{z}]$
given by (7-48), let us vary each $z_j$ in (7-47) so as to minimize the quantity $\Gamma$. This
variation will cause a change in the surface $D[\mathbf{z}]$, as well as in the $z_j$'s appearing
explicitly in (7-47), but the effect on $\Gamma$ of the variation of the surface $D$ is of second
order because it has been set in accordance with (7-48), and it is only the variations
in the explicit $z_j$'s that matter. In this way we obtain the set of $M$ equations

$$\frac{\partial \Gamma}{\partial z_j} = \int_{R_1[\mathbf{z}]} p_j(\mathbf{v})\, d^n\mathbf{v} - \mu = 0, \qquad j = 1, 2, \ldots, M.$$

These equations assert that for the set $\tilde{\mathbf{z}} = \{\tilde z_j\}$ of prior probabilities that we are
seeking, the Neyman-Pearson strategy yields a detection probability for the $j$th signal (or
set of parameter values $\boldsymbol{\theta}_j$) that is the same for all signals, $Q_j \equiv \mu$, $j = 1, 2, \ldots, M$.
Along with (7-48) and the constraints of (7-38) and (7-41), these equations suffice
to determine the set $\tilde{\mathbf{z}}$ of prior probabilities.
Let us now write (7-47) as

$$\Gamma = \sum_{j=1}^{M} z_j \left(Q_j[\mathbf{z}] - \mu\right) - \lambda Q_0[\mathbf{z}] \tag{7-49}$$

and assume that for each set $\mathbf{z} = (z_1, \ldots, z_M)$ the Neyman-Pearson detector (7-48)
that suffers a false-alarm probability $Q_0$ is being utilized. Suppose that the minimum
value of $\bar Q_d[\mathbf{z}]$ occurs for a set $\tilde{\mathbf{z}}$ on the boundary of the admissible region, some of
the prior probabilities $\tilde z_j$ being zero.

Consider a set of infinitesimal variations $\delta z_j$ in the prior probabilities $z_j$; by



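The change in $\Gamma$ is then

$$\delta\Gamma = \sum_{j=1}^{M} \delta z_j \left(Q_j[\tilde{\mathbf{z}}] - \mu\right),$$

for as we have seen, the change in

$$\sum_{j=1}^{M} z_j Q_j[\mathbf{z}] - \lambda Q_0[\mathbf{z}]$$

due to the alteration of the decision surface $D$ under the change $\delta\mathbf{z}$ is of second
order and negligible. If $\Gamma$ is to be minimum, this change $\delta\Gamma$ must be nonnegative.
If $\tilde z_j > 0$, $\delta z_j$ can be either positive or negative, and $Q_j - \mu$ must vanish. If $\tilde z_j = 0$,
however, $\delta z_j$ can only be positive, for the variation must not take the point $\mathbf{z}$ outside
the admissible region. Then in order for $\Gamma$ to be minimum, $Q_j[\tilde{\mathbf{z}}] - \mu$ must be
nonnegative, and we obtain the conditions

$$Q_j[\tilde{\mathbf{z}}] \ge \mu, \qquad \tilde z_j = 0,$$

$$Q_j[\tilde{\mathbf{z}}] = \mu, \qquad \tilde z_j > 0.$$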

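From these it follows that

$$\mu = \sum_{j=1}^{M} \tilde z_j Q_j[\tilde{\mathbf{z}}] = \bar Q_d[\tilde{\mathbf{z}}] = \tilde Q_d,$$

and we can write the conditions for the least favorable distribution as

$$\begin{aligned} Q_j[\tilde{\mathbf{z}}] &\ge \tilde Q_d, & \tilde z_j &= 0, \\ Q_j[\tilde{\mathbf{z}}] &= \tilde Q_d, & \tilde z_j &> 0. \end{aligned} \tag{7-50}$$

If the receiver is designed to meet the extended Neyman-Pearson criterion for the
least favorable prior distribution, it attains the same probability $\tilde Q_d$ of detecting
all signals $s_j(t)$ whose prior probabilities are positive. The other signals can be
detected with greater probability $Q_j[\tilde{\mathbf{z}}]$, but the distribution $\tilde{\mathbf{z}}$ assigns them zero prior
probability.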

When detecting signals in Gaussian noise, the ratio $p_j(\mathbf{v})/p_0(\mathbf{v})$ can be replaced
by the likelihood functional

$$\Lambda_j[v(t)] = \Lambda[v(t) \mid \boldsymbol{\theta}_j]$$

by passing to the limit $n \to \infty$ of an infinite number of samples of the input $v(t)$; the
likelihood functional is given by (2-71) with $s(t) = s(t; \boldsymbol{\theta}_j)$. The decision is based
on the average likelihood functional

$$\Lambda[v(t); \tilde{\mathbf{z}}] = \sum_{j=1}^{M} \tilde z_j \Lambda[v(t) \mid \boldsymbol{\theta}_j],$$

which is compared with a decision level $\Lambda_0$ whose value is picked so that the
false-alarm probability $Q_0$ equals the preassigned value.

As an example, suppose that it is unknown whether the amplitude of the
signal is positive or negative. Denote the sign of the signal amplitude by $\theta_j$; $\theta_1 = 1$,
$\theta_2 = -1$. Now $M = 2$. Under hypothesis $H_1$ the signal is either $s_1(t) = s(t)$ or
$s_2(t) = -s(t)$. The signals are to be detected in the presence of white Gaussian noise
of unilateral spectral density $N$. The likelihood functionals are then

$$\Lambda[v(t) \mid \theta_j] = \exp\left[\frac{2\theta_j}{N}\int_0^T s(t)v(t)\,dt - \frac{1}{N}\int_0^T [s(t)]^2\,dt\right], \qquad j = 1, 2,$$

by (2-74). The symmetry of this problem leads to the conjecture that in the least
favorable situation positive and negative signals are equally likely: $\tilde z_1 = \tilde z_2 = \tfrac12$. The
decision is then based on the average likelihood functional

$$\Lambda[v(t); \tilde{\mathbf{z}}] = \tfrac12 \Lambda[v(t) \mid \theta_1] + \tfrac12 \Lambda[v(t) \mid \theta_2] = e^{-d^2/2} \cosh g,$$

where

$$g = \frac{2}{N}\int_0^T s(t)v(t)\,dt, \qquad d^2 = \frac{2}{N}\int_0^T [s(t)]^2\,dt.$$

This average likelihood functional is a monotone function of the absolute value $|g|$
of the output, at time $t = T$, of a filter matched to the signal $s(t)$. The receiver
therefore compares $|g|$ with some decision level $g_1 > 0$, deciding that a signal is
present if $|g| > g_1$. The false-alarm probability

$$Q_0 = \Pr(|g| > g_1 \mid H_0) = 2\,\mathrm{erfc}\,x, \qquad x = \frac{g_1}{d},$$

determines the appropriate value of the decision level $g_1$. The probability of detecting
either signal is now

$$Q_d(\theta_j) = \Pr(|g| > g_1 \mid H_1, \theta_j) = \mathrm{erfc}(x - d) + \mathrm{erfc}(x + d),$$

and it is the same for both signals, that is, independent of the true value of the
parameter (the sign of the signal), as required by (7-50). This probability of
detection, for a given false-alarm probability $Q_0$, is smaller than the detection probability
determined in Sec. 2.2 for a signal of known sign.
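
The size of this loss is easily evaluated numerically. The following sketch assumes that erfc denotes the Gaussian tail integral of (1-11), that is, the probability that a standard normal variable exceeds $x$, and it takes the known-sign detection probability to be $\mathrm{erfc}(x - d)$ with $Q_0 = \mathrm{erfc}\,x$; the function names are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution

def gaussian_tail(x):
    """erfc x in the sense of (1-11): Pr(standard normal > x)."""
    return _STD.cdf(-x)

def qd_unknown_sign(q0, d):
    """Detection probability when |g| is compared with a level:
    Q0 = 2 erfc x fixes x, then Qd = erfc(x - d) + erfc(x + d)."""
    x = -_STD.inv_cdf(q0 / 2.0)
    return gaussian_tail(x - d) + gaussian_tail(x + d)

def qd_known_sign(q0, d):
    """Detection probability for a signal of known sign (assumed form):
    Q0 = erfc x fixes x, then Qd = erfc(x - d)."""
    x = -_STD.inv_cdf(q0)
    return gaussian_tail(x - d)

if __name__ == "__main__":
    for d in (3.0, 4.0, 5.0):
        print(d, qd_unknown_sign(1e-3, d), qd_known_sign(1e-3, d))
```

For any $d > 0$ the first function returns the smaller value, the price paid for ignorance of the sign.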

The criterion (7-50) enables us to test whether a conjectured least favorable
distribution is correct. Consider a communication system, such as the one we have
been discussing, in which the 1's in a message are transmitted by sending either one
or the other of two orthogonal signals, $s_1(t)$ or $s_2(t)$. These are received in white,
Gaussian noise of unilateral spectral density $N$, and their input signal-to-noise ratios
are

$$d_k^2 = \frac{2}{N}\int_0^T [s_k(t)]^2\,dt, \qquad k = 1, 2.$$

If $d_1^2 = d_2^2$, the least favorable distribution assigns equal probability to each signal:
$\tilde z_1 = \tilde z_2 = 1/2$. The reader can easily verify that the resulting receiver will detect
each signal, when it is present, with equal probability; $Q_1 = Q_2$. Suppose, however,
that $d_1^2 < d_2^2$. One might think that the least favorable distribution would put all
the probability on the first signal: $\tilde z_1 = 1$, $\tilde z_2 = 0$. The receiver would then consist
only of a filter matched to $s_1(t)$ followed by a decision device in which its output is
compared with a decision level set to provide the preassigned false-alarm probability
$Q_0$. Such a system would detect the second signal, however, with a probability
$Q_2 = Q_0$, and this would be less than the probability $Q_1$ of detecting the first, in
violation of the first line of (7-50). The least favorable distribution must in this case
assign some positive probability $\tilde z_2 > 0$ to signal $s_2(t)$. Just how much is difficult
to calculate.
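
The failure of the conjecture $\tilde z_1 = 1$, $\tilde z_2 = 0$ can be made concrete with a short calculation. The sketch assumes coherent detection, so that the output of the filter matched to $s_1(t)$ is Gaussian and the first signal is detected with probability $\mathrm{erfc}(x - d_1)$, where $Q_0 = \mathrm{erfc}\,x$ in the sense of (1-11); because the signals are orthogonal, the second signal leaves that output distributed as under $H_0$. The function names are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()

def probs_with_filter_matched_to_s1(q0, d1):
    """Return (Q1, Q2) for a receiver matched to s1(t) only."""
    x = -_STD.inv_cdf(q0)      # decision level from Q0 = erfc x
    q1 = _STD.cdf(d1 - x)      # erfc(x - d1): s1 shifts the mean by d1
    q2 = q0                    # s2 is orthogonal to s1: no shift at all
    return q1, q2

def violates_first_line_of_7_50(q1, q2):
    """With z2 = 0, (7-50) requires Q2 >= Q1; report a violation."""
    return q2 < q1

if __name__ == "__main__":
    q1, q2 = probs_with_filter_matched_to_s1(1e-4, 3.0)
    print(q1, q2, violates_first_line_of_7_50(q1, q2))
```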

Our prescription (7-50) does not depend on the number $M$ of points in the
parameter space $\Theta$, and we can allow that space to become a continuum with an
infinite number of points. It is only necessary to limit the total range of values of
each parameter. The receiver bases its decisions on an average likelihood functional

$$\Lambda[v(t); z] = \int_{\Theta} z(\boldsymbol{\theta})\, \Lambda[v(t); \boldsymbol{\theta}]\, d^m\boldsymbol{\theta}, \tag{7-51}$$

where $\Lambda[v(t); \boldsymbol{\theta}]$ is the likelihood functional for detecting the signal $s(t; \boldsymbol{\theta})$ in the
noise. The probability of detecting this signal depends on the prior probability
density function $z(\boldsymbol{\theta})$ built into the receiver strategy, and we write it

$$Q_d(\boldsymbol{\theta}; z) = \Pr\{\Lambda[v(t); z] > \Lambda_0 \mid H_1, \boldsymbol{\theta}\}.$$

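From (7-50) it follows that the parameter space $\Theta$ may be found to be divided
into two disjoint regions $\Theta_+$ and $\Theta_0$, which may or may not be simply connected.
In the region $\Theta_+$ the least favorable prior probability density function $\tilde z(\boldsymbol{\theta})$ of the
parameters is positive, and the probability $Q_d(\boldsymbol{\theta}; \tilde z)$ of detecting the signal $s(t; \boldsymbol{\theta})$ is
independent of $\boldsymbol{\theta}$:

$$\tilde z(\boldsymbol{\theta}) > 0, \qquad Q_d(\boldsymbol{\theta}; \tilde z) \equiv \int_{\Theta} \tilde z(\boldsymbol{\theta}')\, Q_d(\boldsymbol{\theta}'; \tilde z)\, d^m\boldsymbol{\theta}' = \bar Q_d[\tilde z], \qquad \boldsymbol{\theta} \in \Theta_+. \tag{7-52a}$$

In the complementary region $\Theta_0$ the least favorable distribution $\tilde z(\boldsymbol{\theta})$ vanishes, and
if we were playing a game against a malevolent adversary, we should not expect
him to send us signals with parameter values in this region $\Theta_0$; for if such a signal
did arrive, its probability $Q_d(\boldsymbol{\theta}; \tilde z)$ of detection would be greater than the uniformly
minimum value for region $\Theta_+$:

$$\tilde z(\boldsymbol{\theta}) = 0, \qquad Q_d(\boldsymbol{\theta}; \tilde z) \ge \bar Q_d[\tilde z], \qquad \boldsymbol{\theta} \in \Theta_0. \tag{7-52b}$$

These detection probabilities $Q_d(\boldsymbol{\theta}; \tilde z)$ are achieved by a receiver that forms from the
input $v(t)$ the average likelihood functional

$$\Lambda[v(t); \tilde z] = \int_{\Theta} \tilde z(\boldsymbol{\theta})\, \Lambda[v(t); \boldsymbol{\theta}]\, d^m\boldsymbol{\theta}$$

and compares it with a decision level $\Lambda_0$ fixed by the preassigned false-alarm
probability.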

It is usually very difficult to calculate the least favorable prior probability
density function $\tilde z(\boldsymbol{\theta})$ if it cannot be discovered immediately through some symmetry,
invariance, or natural ordering of the parameters of the signals. In Sec. 3.3 we
analyzed the detection of a narrowband signal of unknown phase $\psi$ under the
assumption that the prior probability density function $z(\psi)$ was uniform over $(0, 2\pi)$.
The probability $Q_d(\psi)$ of detecting a signal of given phase $\psi$ was shown in Sec. 3.4
to be independent of $\psi$, and the criterion (7-52) verifies that that uniform prior
distribution is indeed the least favorable distribution of the phase $\psi$ of the signal.

Figure 7-4. Narrowband spectral density of the noise.

In general the criterion (7-52) serves either to confirm or to disprove that a
hypothesized prior probability density function $z(\boldsymbol{\theta})$ is least favorable. Consider, for
example, the detection of a quasiharmonic signal

$$s(t; \psi, w) = A\,\mathrm{Re}\, F(t) \exp[i(\Omega + w)t + i\psi]$$

of unknown phase $\psi$ and unknown carrier frequency $\Omega + w$ in the presence of
colored Gaussian noise. The parameter $w$, representing perhaps the Doppler shift of
a signal from a transmitter of unknown velocity, is confined to a band $-B < w < B$.
The narrowband spectral density $\Phi(\omega)$ of the noise might resemble that in Fig. 7-4.
Suppose that the range $2B$ of the Doppler shift $w$ is somewhat greater than the
bandwidth of either the signal or the nonwhite component of the noise.

One might think that the signal would least favorably arrive with a Doppler
shift $w = 0$, that is, in the midst of the strongest part of the spectral density of the
noise, so that the least favorable prior density function of the parameters might be

$$\tilde z(\psi, w) = \frac{1}{2\pi}\,\delta(w), \qquad 0 \le \psi < 2\pi, \qquad -B < w < B. \tag{7-53}$$

The optimum detector would then be the same as that worked out in Sec. 3.3 for
detecting the signal $s(t; \psi) = s(t; \psi, 0)$, whose carrier frequency is completely known.
If we calculated the probability $Q_d(\psi, w)$ of detecting a signal $s(t; \psi, w)$ with a
Doppler shift $w$ near the outer limits $\pm B$ of its range, however, we should find it indeed
independent of the phase $\psi$, but not of $w$, and furthermore, $Q_d(\psi, w)$ would be less
than $Q_d(\psi, 0)$, the average detection probability under the conjectured prior density
function $\tilde z(\psi, w)$ in (7-53). This contradiction of (7-52) establishes that the
conjectured least favorable distribution was wrong. Because of the enormous difficulty
of calculating the detection probability $Q_d(\psi, w)$ for the optimum detector arising
through (7-51) from an arbitrary prior probability density function $z(\psi, w)$, this
detection problem remains unsolved. In the next part, however, we shall describe how
we can obtain an approximation to the least favorable distribution by considering
the threshold detector introduced in Sec. 3.6.3.



7.6.2 The Threshold Detector



A method for determining a prior probability density function $\tilde z(\boldsymbol{\theta})$ of the unknown
signal parameters that is at least approximately least favorable rests on the
assumption that the signals are so weak that the receiver can use the threshold statistic
introduced in Sec. 3.6.3. As we saw in Sec. 4.4.2, this statistic is most appropriate
when the receiver can base its choices between hypotheses $H_0$ and $H_1$ on a large
number $M$ of statistically homogeneous and independent inputs $v_k(t)$, $1 \le k \le M$,
$0 \le t \le T$. As in that section, we designate the input signal strength by $a$, defining
it in such a way that the nonvanishing term of lowest order in an expansion of the
likelihood functional in powers of $a$ is proportional to $a$ itself; see (3-108) and (4-71).

We again isolate the signal strength $a$ from the other unknown signal
parameters $\boldsymbol{\theta}$, and as at the end of Sec. 4.4.4, we write the likelihood functional for detecting
the signal $s(t; a, \boldsymbol{\theta})$ in any one of the $M$ inputs $v_k(t)$ as $\Lambda[v(t); a, \boldsymbol{\theta}]$. As in (4-77)
through (4-79) the threshold statistic is now

$$G_a[\{v_k(t)\}; z] = \sum_{k=1}^{M} g[v_k(t); z], \tag{7-54}$$

where

$$g[v(t); z] = \lim_{a \to 0} \left[a^{-1}\{\Lambda[v(t); a, z] - 1\}\right], \tag{7-55}$$

with

$$\Lambda[v(t); a, z] = \int_{\Theta} z(\boldsymbol{\theta})\, \Lambda[v(t); a, \boldsymbol{\theta}]\, d^m\boldsymbol{\theta}. \tag{7-56}$$

We are explicitly indicating the dependence of the threshold statistic on the accepted
prior probability density function $z(\boldsymbol{\theta})$ of the signal parameters.

When $M \gg 1$ the probability density functions of the statistic in (7-54) are
approximately Gaussian by the central limit theorem, and the false-alarm and detection
probabilities are approximately

$$Q_0 \approx \mathrm{erfc}\,x, \qquad Q_d(\boldsymbol{\theta}; z) \approx \mathrm{erfc}\left(x - \sqrt{M}\, D_g(\boldsymbol{\theta}; z)\right), \tag{7-57}$$

in terms of the error-function integral (1-11), where as in (4-68)

$$[D_g(\boldsymbol{\theta}; z)]^2 = \frac{\{E[g \mid H_1, \boldsymbol{\theta}]\}^2}{\mathrm{Var}_0\, g}, \qquad g = g[v(t); z], \tag{7-58}$$

is the effective signal-to-noise ratio for each input $v_k(t)$. As before, $\mathrm{Var}_0$ indicates
the variance of a random variable under hypothesis $H_0$.
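
The Gaussian approximation (7-57) is simple to evaluate. The sketch below takes erfc to be the Gaussian tail integral of (1-11) and treats the effective signal-to-noise ratio $D_g$ as a given number; the function names and the numerical values are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()

def threshold_detector_qd(q0, m_inputs, d_eff):
    """Qd of (7-57): erfc(x - sqrt(M) * D), with x fixed by Q0 = erfc x."""
    x = -_STD.inv_cdf(q0)
    return _STD.cdf(math.sqrt(m_inputs) * d_eff - x)

def inputs_needed(q0, qd_target, d_eff):
    """Smallest M for which (7-57) predicts Qd >= qd_target."""
    if not (q0 < qd_target < 1.0) or d_eff <= 0.0:
        raise ValueError("need q0 < qd_target < 1 and d_eff > 0")
    x = -_STD.inv_cdf(q0)
    y = _STD.inv_cdf(qd_target)
    return math.ceil(((x + y) / d_eff) ** 2)

if __name__ == "__main__":
    m = inputs_needed(1e-4, 0.99, 0.1)
    print(m, threshold_detector_qd(1e-4, m, 0.1))
```

Because the approximation rests on the central limit theorem, the numbers it yields are trustworthy only when $M$ is large.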
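The values of the unknown parameters $\boldsymbol{\theta}$ appear in (7-57) only through the
expected value $E\{g[v(t); z] \mid H_1, \boldsymbol{\theta}\}$ of the statistic defined in (7-55). The criterion
(7-52), therefore, implies that in the limit $a \to 0$ of weak signals the least favorable
distribution $\tilde z(\boldsymbol{\theta})$ in general divides the parameter space into two disjoint regions
$\Theta_+$ and $\Theta_0$, such that for some constant $h$

$$\begin{aligned} E\{g[v(t); \tilde z] \mid H_1, \boldsymbol{\theta}\} &\equiv ah, & \tilde z(\boldsymbol{\theta}) &> 0, & \boldsymbol{\theta} &\in \Theta_+, \\ E\{g[v(t); \tilde z] \mid H_1, \boldsymbol{\theta}\} &\ge ah, & \tilde z(\boldsymbol{\theta}) &\equiv 0, & \boldsymbol{\theta} &\in \Theta_0. \end{aligned} \tag{7-59}$$

The constant $h$ is determined by

$$h = a^{-1} \int_{\Theta} \tilde z(\boldsymbol{\theta})\, E\{g[v(t); \tilde z] \mid H_1, \boldsymbol{\theta}\}\, d^m\boldsymbol{\theta} = a^{-1} E\{g[v(t); \tilde z] \mid H_1, \tilde z\}. \tag{7-60}$$

The notation "$H_1, \tilde z$" indicates an expected value with respect to the distribution
$\tilde z(\boldsymbol{\theta})$ of the unknown parameters under hypothesis $H_1$. Henceforth we shall refer
to the prior probability density function $\tilde z(\boldsymbol{\theta})$ satisfying (7-59) as the least favorable
distribution, suppressing the qualification "in the limit $a \to 0$."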

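Recalling that the likelihood functional is the limit of the likelihood ratio when
the number $n$ of data $\mathbf{v}$ grows beyond all bounds,

$$\Lambda[v(t); a, z] = \lim_{n \to \infty} \frac{p_1(\mathbf{v}; a, z)}{p_0(\mathbf{v})},$$

with

$$p_1(\mathbf{v}; a, z) = \int_{\Theta} z(\boldsymbol{\theta})\, p_1(\mathbf{v} \mid a, \boldsymbol{\theta})\, d^m\boldsymbol{\theta},$$

we find

$$E\{\Lambda[v(t); a, z] \mid H_0\} = \lim_{n \to \infty} \int_{R_n} p_1(\mathbf{v}; a, z)\, d^n\mathbf{v} = 1,$$

so that $E\{g[v(t); z] \mid H_0\} = 0$. Furthermore, with $a \ll 1$,

$$\begin{aligned} E\{g[v(t); z] \mid H_1, z\} &= a^{-1} \lim_{n \to \infty} \int_{R_n} \left[\frac{p_1(\mathbf{v}; a, z)}{p_0(\mathbf{v})} - 1\right] p_1(\mathbf{v}; a, z)\, d^n\mathbf{v} \\ &= a^{-1} E[\{\Lambda[v(t); a, z] - 1\}\,\Lambda[v(t); a, z] \mid H_0] \\ &= a^{-1} E[\{\Lambda[v(t); a, z] - 1\}^2 \mid H_0] \\ &= a\, \mathrm{Var}_0\, g[v(t); z]. \end{aligned}$$

[Compare (4-75) and the argument leading up to it.]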

For the least favorable distribution $\tilde z(\boldsymbol{\theta})$, therefore, (7-60) yields

$$h = \mathrm{Var}_0\, g[v(t); \tilde z], \tag{7-61}$$

and by (7-58) the effective signal-to-noise ratio is bounded below by

$$[D_g(\boldsymbol{\theta}; \tilde z)]^2 \ge a^2 h = a^2\, \mathrm{Var}_0\, g[v(t); \tilde z],$$

with equality for parameters $\boldsymbol{\theta}$ in the region $\Theta_+$ where $\tilde z(\boldsymbol{\theta}) > 0$. Comparison with
(4-68) shows that the quantity $h$ represents the efficacy of the threshold detector
with respect to $a$ as the measure of input signal strength, the detector having been
designed on the basis of the least favorable distribution $\tilde z(\boldsymbol{\theta})$ of the unknown signal
parameters.

7.6.3 Narrowband Signals in Gaussian Noise 

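Let the signals to be detected during $(0, T)$ have the form

$$s(t; A, \boldsymbol{\theta}) = A\,\mathrm{Re}\, F(t; \boldsymbol{\theta})\, e^{i\Omega t + i\psi},$$

and assume that the phase $\psi$ has its least favorable distribution, the uniform one over
$(0, 2\pi)$. Let the narrowband noise $\mathrm{Re}\, N(t) \exp i\Omega t$ have the complex autocovariance
function

$$\phi(t, s) = \tfrac12 E[N(t) N^*(s)]. \tag{7-62}$$

Define $G(t; \boldsymbol{\theta})$ as the solution of the integral equation

$$F(t; \boldsymbol{\theta}) = \int_0^T \phi(t, s)\, G(s; \boldsymbol{\theta})\, ds, \qquad 0 \le t \le T. \tag{7-63}$$

Then as in (3-53) and (3-60), the likelihood functional for detecting the signal
$s(t; A, \boldsymbol{\theta})$ in the input $\mathrm{Re}\, V(t) \exp i\Omega t$ to the receiver is

$$\Lambda[v(t); A, \boldsymbol{\theta}] = \exp\left[-\tfrac12 A^2 J(\boldsymbol{\theta})\right] I_0(Ar),$$

where

$$r = \left|\int_0^T G^*(t; \boldsymbol{\theta})\, V(t)\, dt\right|, \qquad J(\boldsymbol{\theta}) = \int_0^T G^*(t; \boldsymbol{\theta})\, F(t; \boldsymbol{\theta})\, dt,$$

after averaging over the phase $\psi$. By using the series expansion (3-61) of the modified
Bessel function, we can expand this likelihood functional $\Lambda[v(t); A, \boldsymbol{\theta}]$ in powers of
$A$, and we find

$$\Lambda[v(t); A, \boldsymbol{\theta}] = 1 + \tfrac14 A^2 \left[r^2 - 2J(\boldsymbol{\theta})\right] + O(A^4).$$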
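
The quality of this weak-signal expansion can be checked numerically. In the sketch below the modified Bessel function $I_0$ is summed from its power series, and the values of $r$ and $J$ are arbitrary illustrative numbers, not taken from any particular signal.

```python
import math

def bessel_i0(x, terms=40):
    """I_0(x) summed from its power series sum_k (x^2/4)^k / (k!)^2."""
    total, term = 0.0, 1.0
    q = 0.25 * x * x
    for k in range(terms):
        total += term
        term *= q / ((k + 1) ** 2)
    return total

def likelihood_exact(amplitude, r, j):
    """exp(-A^2 J / 2) I_0(A r), the phase-averaged likelihood functional."""
    return math.exp(-0.5 * amplitude ** 2 * j) * bessel_i0(amplitude * r)

def likelihood_weak(amplitude, r, j):
    """Expansion through second order: 1 + (A^2/4)(r^2 - 2J)."""
    return 1.0 + 0.25 * amplitude ** 2 * (r * r - 2.0 * j)

if __name__ == "__main__":
    r, j = 1.5, 0.8   # arbitrary illustrative values
    for amplitude in (0.2, 0.1, 0.05):
        err = abs(likelihood_exact(amplitude, r, j)
                  - likelihood_weak(amplitude, r, j))
        print(amplitude, err)
```

Halving $A$ should reduce the discrepancy by a factor near 16, as befits a remainder of order $A^4$.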

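Comparison with (7-55) prompts us to take $a = \tfrac12 A^2$ as our measure of the signal
strength, and the threshold statistic for a single input $v(t)$ is then

$$g[v(t); z] = \int_{\Theta} z(\boldsymbol{\theta})\, g[v(t); \boldsymbol{\theta}]\, d^m\boldsymbol{\theta}, \tag{7-64}$$

with

$$g[v(t); \boldsymbol{\theta}] = \tfrac12\left[r^2 - 2J(\boldsymbol{\theta})\right] = \tfrac12 \left|\int_0^T G^*(t; \boldsymbol{\theta})\, V(t)\, dt\right|^2 - \int_0^T G^*(t; \boldsymbol{\theta})\, F(t; \boldsymbol{\theta})\, dt. \tag{7-65}$$

For this random variable, $E\{g[v(t); \boldsymbol{\theta}] \mid H_0\} = 0$, as can be shown by replacing $V(t)$
by $N(t)$ and taking the expected value with the use of (7-62).
When a signal $s(t; A, \boldsymbol{\theta})$ is present,

$$V(t) = A F(t; \boldsymbol{\theta})\, e^{i\psi} + N(t),$$

and as the reader can easily show by substituting into (7-65),

$$E\{g[v(t); \boldsymbol{\theta}_1] \mid H_1, \boldsymbol{\theta}\} = a\, \Gamma(\boldsymbol{\theta}_1, \boldsymbol{\theta}),$$

where

$$\Gamma(\boldsymbol{\theta}_1, \boldsymbol{\theta}_2) = \left|\int_0^T G^*(t; \boldsymbol{\theta}_1)\, F(t; \boldsymbol{\theta}_2)\, dt\right|^2 \tag{7-66}$$

plays a central role in this problem. In terms of it, by (7-64),

$$E\{g[v(t); z] \mid H_1, \boldsymbol{\theta}\} = a \int_{\Theta} z(\boldsymbol{\theta}_1)\, \Gamma(\boldsymbol{\theta}_1, \boldsymbol{\theta})\, d^m\boldsymbol{\theta}_1.$$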

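Putting this into (7-59), we find that the criterion for the least favorable
distribution $\tilde z(\boldsymbol{\theta})$ in the weak-signal limit is

$$\begin{aligned} \int_{\Theta} \tilde z(\boldsymbol{\theta}_1)\, \Gamma(\boldsymbol{\theta}_1, \boldsymbol{\theta})\, d^m\boldsymbol{\theta}_1 &= h, & \tilde z(\boldsymbol{\theta}) &> 0, & \boldsymbol{\theta} &\in \Theta_+, \\ \int_{\Theta} \tilde z(\boldsymbol{\theta}_1)\, \Gamma(\boldsymbol{\theta}_1, \boldsymbol{\theta})\, d^m\boldsymbol{\theta}_1 &\ge h, & \tilde z(\boldsymbol{\theta}) &\equiv 0, & \boldsymbol{\theta} &\in \Theta_0. \end{aligned} \tag{7-67}$$

The solution $\tilde z(\boldsymbol{\theta})$ of these equations will be proportional to $h$, and $h$ is determined
by the normalization

$$\int_{\Theta} \tilde z(\boldsymbol{\theta})\, d^m\boldsymbol{\theta} = 1. \tag{7-68}$$

Furthermore, multiplying the equations in (7-67) by $\tilde z(\boldsymbol{\theta})$ and integrating over the
parameter space $\Theta$, we find

$$h = \int_{\Theta}\int_{\Theta} \tilde z(\boldsymbol{\theta}_1)\, \Gamma(\boldsymbol{\theta}_1, \boldsymbol{\theta}_2)\, \tilde z(\boldsymbol{\theta}_2)\, d^m\boldsymbol{\theta}_1\, d^m\boldsymbol{\theta}_2. \tag{7-69}$$

The reader should verify by using (3-44) that (7-61) holds for the statistic defined in
(7-64) and (7-65).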

The kernel $\Gamma(\boldsymbol{\theta}_1, \boldsymbol{\theta}_2)$ is a kind of generalized ambiguity function. Indeed, if the
unknown signal parameters are the arrival time $\tau$ and the frequency shift $w$,

$$F(t; \tau, w) = F(t - \tau)\, e^{iw(t - \tau)},$$

and if the noise is white with unilateral spectral density $N$, so that

$$G(t; \tau, w) = N^{-1} F(t; \tau, w),$$

then as the reader can easily show,

$$\Gamma(\tau_1, w_1; \tau_2, w_2) = N^{-2}\, |\lambda(\tau_1 - \tau_2, w_1 - w_2)|^2,$$

in terms of the complex ambiguity function defined in (6-96).

The conditions (7-67) are the same as arise when one minimizes the quadratic
form on the right side of (7-69) under the constraints (7-68) and $\tilde z(\boldsymbol{\theta}) \ge 0$, $\forall\, \boldsymbol{\theta} \in \Theta$.
The only practical way to solve such a problem seems to be to sample the parameter
space $\Theta$ at a finite number of points and approximate the integral in (7-69) by
a double summation and the integrals in (7-67) and (7-68) by single summations
over values of $\tilde z(\boldsymbol{\theta})$ at those points. The minimization then becomes a problem in
quadratic programming [Col71].
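
To indicate how such a sampled problem might be attacked, the following sketch minimizes the quadratic form $\sum_j \sum_k c_j \Gamma_{jk} c_k$ over nonnegative weights summing to 1 by a simple conditional-gradient (Frank-Wolfe) iteration with exact line search. This is a stand-in chosen for brevity, not the method of the references cited here, and the kernel `example_kernel` is a hypothetical positive-semidefinite function invented for illustration.

```python
import math

def mat_vec(gamma, c):
    return [sum(row[k] * c[k] for k in range(len(c))) for row in gamma]

def quadratic_form(gamma, c):
    return sum(cj * gj for cj, gj in zip(c, mat_vec(gamma, c)))

def least_favorable_weights(gamma, iterations=2000, tol=1e-12):
    """Approximately minimize c' Gamma c over the probability simplex
    (Gamma is assumed symmetric and positive semidefinite)."""
    n = len(gamma)
    c = [1.0 / n] * n
    for _ in range(iterations):
        gc = mat_vec(gamma, c)
        h = sum(cj * gj for cj, gj in zip(c, gc))
        j_min = min(range(n), key=gc.__getitem__)
        if h - gc[j_min] <= tol:       # every (Gamma c)_j >= h - tol
            break
        d = [-cj for cj in c]          # direction toward vertex j_min
        d[j_min] += 1.0
        curvature = quadratic_form(gamma, d)
        slope = sum(dj * gj for dj, gj in zip(d, gc))   # negative here
        step = 1.0 if curvature <= 0.0 else min(1.0, -slope / curvature)
        c = [cj + step * dj for cj, dj in zip(c, d)]
    return c

def example_kernel(w1, w2, p=4.0):
    """Hypothetical kernel: detectability is depressed near w = 0."""
    u1 = 1.0 / (1.0 + p / (1.0 + w1 * w1))
    u2 = 1.0 / (1.0 + p / (1.0 + w2 * w2))
    return u1 * u2 * math.exp(-(w1 - w2) ** 2)

if __name__ == "__main__":
    grid = [-2.0 + 0.5 * i for i in range(9)]
    gamma = [[example_kernel(a, b) for b in grid] for a in grid]
    c = least_favorable_weights(gamma)
    print([round(ck, 4) for ck in c], quadratic_form(gamma, c))
```

At convergence the stopping test reproduces the sampled form of (7-67): the components of $\Gamma \mathbf{c}$ equal $h$ where the weights are positive and are no smaller elsewhere. Convergence of this elementary iteration can be slow, and a proper quadratic-programming routine would be preferable for fine grids.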
As an example, suppose that the only unknown parameter is the frequency
shift $w$ of a narrowband signal having the form

$$s(t; w) = A\,\mathrm{Re}\, F(t)\, e^{i(\Omega + w)t + i\psi}.$$

Its complex envelope is taken as

$$F(t; w) = F(t)\, e^{iwt}.$$

The parameter $w$ is assumed to lie in a finite band $(-B, B)$, and as described at the
end of Sec. 7.6.1, the narrowband spectral density $\Phi(\omega)$ of the noise is nonuniformly
distributed over that band. In [Hel92b] this problem was treated under the
assumptions that the noise is stationary and that the observation interval is so much longer
than the duration of the signal that it can be taken as $(-\infty, \infty)$. Then the integral
equation (7-63) becomes

$$F(t)\, e^{iwt} = \int_{-\infty}^{\infty} \phi(t - s)\, G(s; w)\, ds,$$

where $\phi(\cdot)$ is the complex autocovariance function of the noise. As in Sec. 2.2.2, we
can solve this integral equation by Fourier transformation, and the Fourier transform
$g(\omega; w)$ of $G(s; w)$ is found to be

$$g(\omega; w) = \frac{f(\omega - w)}{\Phi(\omega)}$$

in terms of the Fourier transform $f(\omega)$ of $F(t)$. The kernel defined in (7-66) becomes

$$\Gamma(w_1, w_2) = \left|\int_{-\infty}^{\infty} \frac{f^*(\omega - w_1)\, f(\omega - w_2)}{\Phi(\omega)}\, \frac{d\omega}{2\pi}\right|^2.$$

In [Hel92b] the noise was taken as the sum of white noise and noise with a Lorentz
spectral density of bandwidth $\mu$. Without loss of generality the spectral density of
the white component was set equal to 1, whereupon

*M = • + 4^. 

with P measuring the relative strength of the nonwhite component. The signal was 
taken to be rectangular. 
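
A kernel of this kind can be evaluated by direct numerical integration. The sketch below makes several assumptions of its own: the rectangular envelope has unit energy and duration `duration`, so that $f(\omega)$ is a real sinc function; the Lorentz component is given the form $P\mu^2/(\omega^2 + \mu^2)$, which is one conventional normalization and not necessarily that of [Hel92b]; and the infinite range of integration is truncated to `(-span, span)`.

```python
import math

def rect_spectrum(omega, duration=1.0):
    """Fourier transform of a unit-energy rectangular envelope centred
    at t = 0 (an assumed signal model); real and even in omega."""
    half = 0.5 * omega * duration
    sinc = 1.0 if abs(half) < 1e-12 else math.sin(half) / half
    return math.sqrt(duration) * sinc

def noise_density(omega, p=0.0, mu=1.0):
    """White noise of unit density plus an ASSUMED Lorentz component."""
    return 1.0 + p * mu * mu / (omega * omega + mu * mu)

def kernel(w1, w2, p=0.0, mu=1.0, duration=1.0, span=100.0, step=0.02):
    """Trapezoidal estimate of
    |integral f*(w - w1) f(w - w2) / Phi(w) dw / 2 pi|^2."""
    n = int(round(2.0 * span / step))
    total = 0.0
    for i in range(n + 1):
        omega = -span + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += (weight
                  * rect_spectrum(omega - w1, duration)
                  * rect_spectrum(omega - w2, duration)
                  / noise_density(omega, p, mu))
    integral = total * step / (2.0 * math.pi)
    return integral ** 2   # the spectrum is real, so |.|^2 is a square

if __name__ == "__main__":
    for w in (0.0, 5.0, 20.0):
        print(w, kernel(w, w, p=5.0))
```

With $P > 0$ the diagonal values $\Gamma(w, w)$ are smallest near $w = 0$, where the signal spectrum overlaps the nonwhite noise, and rise toward their white-noise value as $|w|$ increases.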

The frequency band $(-B, B)$ was sampled at a number of uniformly spaced
frequencies $w_j$ in order to convert the integrals in (7-67) and (7-68) to summations,
and the quadratic programming problem of finding the samples $\tilde z(w_j)$ of the least
favorable distribution was solved numerically by a method due to Wolfe [Wol59].
It transpired that the least favorable distribution is concentrated in a number $M$ of
weighted delta functions:

$$\tilde z(w) = \sum_{k=1}^{M} c_k\, \delta(w - w_k), \qquad w_1 = -B, \qquad w_M = B. \tag{7-70}$$

The number $M$ and the approximate locations of the delta functions were determined
by the quadratic-programming algorithm.

For a prior probability density function of the form in (7-70), the first set of
equations in (7-67) becomes

$$\sum_{k=1}^{M} c_k\, \Gamma(w_j, w_k) \equiv h, \qquad 1 \le j \le M, \tag{7-71}$$

when the left side is evaluated at the $M$ points $\theta = w_j$. By (7-69) we must minimize
the quadratic form

$$Q(\mathbf{w}, \mathbf{c}) = \sum_{j=1}^{M} \sum_{k=1}^{M} c_j\, \Gamma(w_j, w_k)\, c_k, \qquad \mathbf{w} = (w_1, \ldots, w_M), \qquad \mathbf{c} = (c_1, \ldots, c_M), \tag{7-72}$$

under the constraints

$$c_k \ge 0, \qquad 1 \le k \le M, \qquad \sum_{k=1}^{M} c_k = 1. \tag{7-73}$$

Starting with values of $\mathbf{w}$ provided by the quadratic-programming algorithm,
(7-71) was solved with (7-73) to determine the set $\mathbf{c}$ of coefficients more accurately.
These were substituted into (7-72), and a gradient algorithm was used to minimize
the quadratic form by varying the $M$ frequencies $w_k$. With the new set $\mathbf{w}$, a new set
$\mathbf{c}$ of coefficients was obtained from (7-71) and (7-73). The procedure was repeated
until the value of $Q(\mathbf{w}, \mathbf{c})$ ceased changing significantly. Details of the computations
can be found in [Hel92b].
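
The step in which (7-71) is solved together with the normalization in (7-73) amounts to a set of linear equations: if $\mathbf{y}$ solves $\Gamma \mathbf{y} = \mathbf{1}$, then $h = 1/\sum_k y_k$ and $\mathbf{c} = h\mathbf{y}$. A minimal sketch follows, using plain Gaussian elimination; the function names are hypothetical, and the nonnegativity of the resulting coefficients is not enforced and must be checked separately.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("kernel matrix is singular to working precision")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def equalizing_weights(gamma):
    """Return (c, h) with sum_k Gamma[j][k] c[k] = h for all j, sum(c) = 1."""
    y = solve_linear(gamma, [1.0] * len(gamma))
    h = 1.0 / sum(y)
    return [yk * h for yk in y], h

if __name__ == "__main__":
    c, h = equalizing_weights([[2.0, 1.0], [1.0, 3.0]])
    print(c, h)   # all c[k] should be checked for nonnegativity
```

A negative coefficient signals that the corresponding frequency does not belong among the delta functions of (7-70) and should be dropped before the equations are solved again.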

The detector optimum in the weak-signal limit now consists of a bank of $M$
narrowband pass filters matched to signals

$$\sigma(t; w_k) = \mathrm{Re}\, G(t; w_k)\, e^{i\Omega t}, \qquad 1 \le k \le M.$$

After these are quadratic rectifiers whose outputs are weighted with the prior
probabilities $c_k$ to form the threshold statistic

$$S = \tfrac12 \sum_{k=1}^{M} c_k \left|\int G^*(T' - s; w_k)\, V(s)\, ds\right|^2,$$

where $T'$ is a delay long enough to encompass the signals $\sigma(t; w_k)$. The value of $S$ is
compared with a decision level set to induce the preassigned false-alarm probability
$Q_0$. When a number of inputs $v_j(t)$ are available one after another, the resulting
values of the statistic $S$ are summed before comparison with the decision level. The
performance of the system is measured by the minimum value $h$ of the quadratic
form $Q(\mathbf{w}, \mathbf{c})$; it decreases with an increasing width $2B$ of the expected range of the
frequency shifts $w$.

Problems 

7-1. Compare the rms bandwidth $(\Delta\omega^2)^{1/2}/2\pi$ defined through (6-85) and the bandwidth
$W$ defined by (7-11) for strictly bandlimited signals with spectrum $f(\omega)$ uniform in
$(-\pi W, \pi W)$, for Gaussian signals, and for rectangular signals.

7-2. Show from (7-7) that for a strictly bandlimited signal $\mathrm{Re}\, F(t) \exp i\Omega t$ with a uniform
spectrum over the range $-\pi W < \omega < \pi W$, samples of the complex envelope $V(t)$ of
the output of the filter matched to the signal are uncorrelated when separated in time
by multiples of $1/W$.

7-3. When the time-bandwidth product L - WT is large, the statistic g' A in (7-6) has ap- 
proximately a Gaussian distribution, and the false-alarm and detection probabilities are 
approximately 

$$Q_0 \approx \mathrm{erfc}\,x, \qquad Q_d \approx \mathrm{erfc}(x - D),$$

where $D^2$ is the effective signal-to-noise ratio. Evaluate $D^2$ in terms of the signal energy
$E$, the spectral density $N$ of the white noise, and the time-bandwidth product $WT$.
With this Gaussian approximation, calculate the input signal-to-noise ratios $S$ (dB)
needed to attain a probability of detection $Q_d = 0.999$ when $WT = 10{,}000$ and the
false-alarm probabilities $Q_0$ are $10^{-4}$, $10^{-6}$, $10^{-8}$, and $10^{-10}$. Compare your results with
the values shown in Fig. 7-1.
7-4. Using the counting functional that we employed in deriving Rice's formula (7-17), work
out an expression for the variance $\mathrm{Var}\, N_+$ of the number $N_+$ of times that a process
$r(t)$ crosses the level $a$ in the upward direction during an interval of duration $T$.

7-5. The signal $s(t) = \theta f(t)$ is to be detected in Gaussian noise having autocovariance
function $\phi(t, u)$ by observation of the input $v(t)$ to the receiver during an interval $(0, T)$.
The sign $\theta$ of the signal is either $+1$ or $-1$ with equal probabilities. For a false-alarm
probability $Q_0 = 10^{-6}$ and for detection probabilities $Q_d = 0.9$, $0.99$, and $0.999$,
calculate the signal-to-noise ratios $d^2$ required. Compare these with the signal-to-noise
ratios needed for the same values of $Q_0$ and $Q_d$ when the sign $\theta$ of the signal is known
to be $\theta = +1$. Determine those ratios in decibels and compare with the corresponding
"losses" when the phase $\psi$ of a narrowband signal is unknown, as plotted in Fig. 3-5.

7-6. In a binary communication system transmitting 0's and 1's every $T$ seconds, no signal
at all is sent when a 0 is the message symbol. When a 1 must be dispatched, the
transmitter sends either one, but not both, of two signals, which we label A and B.
These suffer Rayleigh fading during propagation to the receiver. Signal A is received
as

$$s_A(t) = A\,\mathrm{Re}\, F_1(t)\, e^{i\Omega t + i\psi}$$

and signal B as

$$s_B(t) = A\,\mathrm{Re}\, F_2(t)\, e^{i\Omega t + i\psi},$$

in which the phase $\psi$ is random and uniformly distributed over $(0, 2\pi)$. When signal A
is sent, the signal-to-noise ratio $d$ has the Rayleigh distribution

$$z(d) = \frac{d}{s_1^2} \exp\left(-\frac{d^2}{2 s_1^2}\right)$$

as in (4-17); $d$ is proportional to the random amplitude $A$. When signal B is sent, the
signal-to-noise ratio $d$ has the distribution

$$z(d) = \frac{d}{s_2^2} \exp\left(-\frac{d^2}{2 s_2^2}\right),$$

with $s_1 \ne s_2$ in general. The signal envelopes $F_1(t)$ and $F_2(t)$ are orthogonal. Whenever
it needs to transmit a 1, the transmitter sends signal A with probability $z_A$ and signal
B with probability $z_B$; $z_A + z_B = 1$. The receiver does not know which signal, if any, is
transmitted. The signals are received in white Gaussian noise with unilateral spectral
density $N$.

(a) Determine the optimum strategy under the Neyman-Pearson criterion by which
the receiver should decide whether a 0 or a 1 was sent. The average detection
probability $\bar{Q}_d$ is to be maximum when the false-alarm probability has the
preassigned value $Q_0$. Describe how your strategy might be implemented. Show that the
decision can be based on the values of just two random variables $w_1$ and $w_2$ that
have exponential distributions under each hypothesis. [An exponential distribution
is one for which the density function has the form $a \exp(-ax)\,U(x)$ for some
positive constant $a$.] To illustrate the detection strategy, draw a diagram of the
$(w_1, w_2)$ plane and its division into the two regions $R_0$ and $R_1$ in which hypotheses
$H_0$ and $H_1$ are respectively chosen.

(b) Show how to calculate for the optimum strategy the false-alarm probability $Q_0$,
the probabilities $Q_{dA}$ and $Q_{dB}$ of detecting signals A and B, respectively, and the
overall average detection probability $\bar{Q}_d$. Reduce each of your expressions for these
probabilities until they involve a single integration, but do not attempt to evaluate
these single integrals, which would probably have to be computed numerically.
One could then search for the values of $z_A$ and $z_B$ for which $\bar{Q}_d$ is minimum by
applying the criterion in (7-50).

(c) Now assume that $s_A^2 \ll 1$ and $s_B^2 \ll 1$, and determine the prior probabilities $z_A$
and $z_B$ that are least favorable in the sense of Sec. 7.6.2. The ratio $s_A^2 : s_B^2$ is
assumed to stay fixed in the passage to the limit $s_A^2 \to 0$, $s_B^2 \to 0$. Calculate the
value of $h$ in (7-60) in terms of $s_A^2$ and $s_B^2$.

7-7. Calculate $\operatorname{Var} g[v(t); \tau]$ for the threshold statistic $g[v(t); \tau]$ as given by (7-64) and (7-65)
by using (3-44), and show that your result equals the value of $A$ in (7-69).
8



Detection of Signals Under 
Conditions of Uncertainty 



8.1 DETECTION IN NOISE OF UNKNOWN STRENGTH 

The detection of signals having unknown parameters was introduced in Sec. 3.6, and 
our study continued in Chapter 7 with special attention to signals with unknown 
arrival time and Doppler shift. Now we take up the detection of signals under other 
kinds of uncertainty, beginning in this section with noise of unknown strength and 
in the next section treating detection in noise whose probability density functions 
are only vaguely known. 

8.1.1 The CFAR Receiver

When a radar system is being jammed by broadband noise from an enemy transmitter,
it is faced with the problem of detecting echo signals in the presence of noise
of unknown spectral density. We assume here that that spectral density is so much
broader than the spectra of the signals themselves that it can be considered uniform
over the frequency band of the echoes, and the noise is effectively white. Were there
no other signals about except our echo signals and the enemy's noise and if we knew
the shape of its spectral density, but not its total power, we could in principle (as
will be seen in Problems 8-1 and 8-2) measure that power exactly and detect the echo
signals with the same reliability $(Q_0, Q_d)$ as though the strength of the interfering
noise were known a priori. Conditions are in reality never so ideal. The approach
usually taken is to estimate the spectral density $N$ of the noise, assumed white, and
to substitute that estimate for the true value of $N$ in the specification of the decision
level with which the output of the receiver is compared.

The receiver will be assumed to contain a threshold detector; that is, it adds
the quadratically rectified outputs of a filter matched to the signal during $M$
successive intervals, as in Secs. 4.2 and 7.2. We now assume, as described in Sec. 7.1,
that the rectified output $|V(t)|^2$ of the matched filter is sampled at $L = WT$ times
during each of the $M$ intervals $(T', T + T')$ in which it is observed; again $W$ is the
bandwidth of the signal, defined as in (7-13), and $T'$ is the delay in the matched
filter, $T' \ll T$. In radar parlance the interval is divided into $L$ range bins, and in
effect we assume that an echo will appear in one and only one of these subintervals
of duration $W^{-1} = T/L$. We shall deal with a single such sample or range bin in
each interpulse period, and our analysis will be the same as though the receiver were
deciding about the presence or absence of a target at a fixed, predetermined location.

If the target might be moving, the receiver must again contain a bank of parallel 
filters matched to signals with a number of discrete carrier frequencies $\Omega + k\,\delta w$,
spaced across the spectral band in which echoes can be expected to appear, as in 
Sec. 7.5. The instants at which samples are taken during each of the M observation 
intervals must then be adjusted to compensate for motion of the target between 
transmitted pulses. Under the presumption that this has been done, we can treat the 
detection in the same manner as though the target were stationary. 

The receiver must choose between the two standard hypotheses

$$v_k(t) = n_k(t), \qquad 1 \le k \le M, \qquad (H_0)$$

and

$$v_k(t) = n_k(t) + s_k(t), \qquad 1 \le k \le M, \qquad (H_1)$$

$$s_k(t) = A \operatorname{Re} F(t) \exp(i\Omega t + i\psi_k),$$

where the phases $\psi_k$ are independently random and uniformly distributed over
$(0, 2\pi)$. The noise inputs $n_k(t)$ are white, Gaussian, and independently random
from one input to another. The receiver forms the decision statistic

$$U = \tfrac{1}{2} \sum_{k=1}^{M} |z_k|^2, \tag{8-1}$$

as in (4-19), with

$$z_k = x_k + i y_k = C \int_0^{T'} F^*(t)\, V_k(t)\, dt. \tag{8-2}$$

As before, $V_k(t)$ is the complex envelope of the $k$th input, and $(0, T')$ is an
interval long enough to contain the entire signal. The components $x_k$ and $y_k$ are
independent Gaussian random variables. Under hypothesis $H_0$ their expected values
are zero. Under hypothesis $H_1$, with the normalization constant $C$ suitably chosen,
their expected values are given by

$$E(z_k \mid H_1) = E(x_k \mid H_1) + iE(y_k \mid H_1) = d_k \exp i\psi_k,$$

as in (4-23), where now

$$d_k^2 = \frac{2E_k}{N'}, \qquad 1 \le k \le M, \tag{8-3}$$

with $E_k$ the energy of the signal appearing in the $k$th observation interval. In (8-3)
$N'$ is a "fiducial" noise level introduced for the sake of normalization and taking the
place of the true spectral density $N$, which is now unknown.

Were that spectral density $N$ known, the receiver would compare the statistic
$U$ with a decision level $U_0$ proportional to $N$. Instead the receiver must estimate
that noise level $N$. To that end it acquires a number $M'$ of auxiliary inputs $v'_j(t)$
containing only white noise that is deemed to be independent of the noise inputs
$n_k(t)$ and to possess the same unknown spectral density $N$. These inputs $v'_j(t)$ will
ordinarily be taken in spectral bands close to, but not overlapping, that of the signals
$s_k(t)$, and care must be taken that no significant vestiges of these signals can appear
in them. They are passed through suitable filters whose sampled outputs, much as
in (8-2), provide $M'$ complex samples $z'_k = x'_k + iy'_k$, $1 \le k \le M'$. In practice there
are a number $R$ of such auxiliary inputs and filters, the output of each of which is
sampled during each of the $M$ interpulse intervals, whereupon $M' = RM$.

The components $x'_k$, $y'_k$ are independent Gaussian random variables with expected
values zero under both hypotheses $H_0$ and $H_1$. These components are furthermore
statistically independent of the $M$ components $x_k$ and $y_k$ of the samples
$z_k = x_k + iy_k$ in (8-2), and they have the same variances. The statistic

$$U' = \tfrac{1}{2} \sum_{k=1}^{M'} |z'_k|^2 \tag{8-4}$$

then constitutes an estimate of the spectral density $N$. The expected value of $U'$,
as the reader can easily show, is proportional to the actual value $N$ of the spectral
density of the noise, and with appropriate scaling $U'$ becomes an unbiased estimator
of $N$.

One replaces the unknown spectral density $N$, as it figures in the decision level
$U_0$, by the observed quantity $U'$, which is a random variable; and the receiver now
decides that a signal is present in the range bin of concern if $U > \beta U'$, where $\beta$
is a constant whose value is specified by the preassigned false-alarm probability $Q_0$
[Fin68]. Because this receiver suffers the same false-alarm probability no matter what
the true value of the spectral density of the total white noise may be, it is called a
constant false-alarm rate (CFAR) receiver.

8.1.2 False-alarm Probability

The false-alarm probability of this receiver is

$$Q_0 = \Pr(U > \beta U' \mid H_0) = \Pr(\beta' > \beta \mid H_0).$$

As shown in elementary texts on probability theory, the probability density function
of the random variable $\beta' = U/U'$ is given by

$$p(\beta') = \int_0^\infty y\, p_U(\beta' y)\, p_{U'}(y)\, dy,$$

where $p_U(\cdot)$ is the probability density function of the random variable $U$ in (8-1)
and $p_{U'}(\cdot)$ is the density function of $U'$ in (8-4) [Hel91, p. 167], [Pap91, p. 138].
With $U$ and $U'$ suitably normalized, these are both gamma distributions with $M$
and $M' = RM$ degrees of freedom, respectively,

$$p_U(y) = \frac{y^{M-1} e^{-y}}{(M-1)!}\, U(y), \qquad p_{U'}(y) = \frac{y^{M'-1} e^{-y}}{(M'-1)!}\, U(y), \tag{8-5}$$

under hypothesis $H_0$, and we find

$$p(\beta') = \frac{\beta'^{\,M-1}}{(M-1)!\,(M'-1)!} \int_0^\infty y^{M+M'-1} e^{-(1+\beta')y}\, dy
= \frac{(M+M'-1)!}{(M-1)!\,(M'-1)!}\, \frac{\beta'^{\,M-1}}{(1+\beta')^{M+M'}}, \tag{8-6}$$

which is known as the beta distribution. Then the false-alarm probability is

$$Q_0 = \frac{(M+M'-1)!}{(M-1)!\,(M'-1)!} \int_\beta^\infty \frac{y^{M-1}}{(1+y)^{M+M'}}\, dy.$$

This has been tabulated by Pearson [Pea68] as the "incomplete beta distribution."
By introducing the variable $x = (1+y)^{-1}$ this is sometimes written as

$$Q_0 = \frac{(M+M'-1)!}{(M-1)!\,(M'-1)!} \int_0^{1/(1+\beta)} x^{M'-1} (1-x)^{M-1}\, dx.$$

By expanding $(1 - x)^{M-1}$ by the binomial theorem, this can be written in closed
form, but as the resulting series contains alternating signs, it is inconvenient for
computation, particularly when $M$ and $M' = RM$ are large. Section E.4 of the
Appendix presents a series (E-10) due to Robertson [Rob76] that in typical situations
converges rapidly and is preferable for computing the false-alarm probability. The
relation of the statistic $\beta' = U/U'$ to the $F$ statistic of the analysis of variance is
also described there. We turn to the problem of determining the decision level $\beta$
that yields a preassigned false-alarm probability $Q_0$.
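
As a numerical cross-check of the false-alarm probability above, the following Python sketch evaluates $Q_0$ for integer $M$ and $M'$ by writing the incomplete-beta integral as a finite sum of positive binomial terms, and compares it with a Monte Carlo simulation of the test $U > \beta U'$ at two different noise levels. The positive-term sum is a standard identity for the incomplete beta function with integer parameters; it is used here for convenience and is not claimed to be Robertson's series (E-10). The function names, the noise variances, and the number of trials are illustrative assumptions.

```python
import math
import random


def false_alarm_probability(beta, M, M_aux):
    """Q0 = Pr(U/U' > beta | H0) for integer M, M' (here M_aux) and beta > 0.

    Uses sum_{j=M'}^{n} C(n, j) x^j (1 - x)^(n - j) with n = M + M' - 1
    and x = 1/(1 + beta); every term is positive.
    """
    n = M + M_aux - 1
    log_x = -math.log1p(beta)                       # ln x
    log_1mx = math.log(beta) - math.log1p(beta)     # ln(1 - x)
    total = 0.0
    for j in range(M_aux, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(j + 1)
                    - math.lgamma(n - j + 1) + j * log_x + (n - j) * log_1mx)
        total += math.exp(log_term)
    return total


def simulate_false_alarms(beta, M, M_aux, noise_var, trials, rng):
    """Fraction of noise-only trials in which U > beta * U'."""
    sigma = math.sqrt(noise_var)

    def half_sum_sq(count):
        # (1/2) * sum |z_k|^2 with independent Gaussian components x_k, y_k
        return 0.5 * sum(rng.gauss(0.0, sigma) ** 2 + rng.gauss(0.0, sigma) ** 2
                         for _ in range(count))

    alarms = sum(half_sum_sq(M) > beta * half_sum_sq(M_aux)
                 for _ in range(trials))
    return alarms / trials


if __name__ == "__main__":
    rng = random.Random(1)
    M, M_aux, beta = 4, 8, 1.0
    print("formula:", false_alarm_probability(beta, M, M_aux))
    for noise_var in (1.0, 100.0):   # the simulated rate should not depend on this
        print("noise variance", noise_var, "->",
              simulate_false_alarms(beta, M, M_aux, noise_var, 20000, rng))
```

Because both $U$ and $U'$ scale with the noise variance, the simulated false-alarm rate should agree with the formula at either noise level to within sampling error, which is the constant-false-alarm-rate property.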
When $M$ and $M' = RM$ are large, the random variable $V = U - \beta U'$, as the
difference of two sums of large numbers of independent random variables, has
approximately a Gaussian distribution by virtue of the central limit theorem. Without
loss of generality we can take $N' = N = 1$, whereupon

$$E(V) = E(U) - \beta E(U') = M - \beta M'$$

and

$$\sigma_V^2 = \operatorname{Var}(V \mid H_0) = \operatorname{Var} U + \beta^2 \operatorname{Var} U' = M + \beta^2 M',$$

so that

$$Q_0 = \Pr(V > 0 \mid H_0) \approx \operatorname{erfc}\left(-\frac{E(V)}{\sigma_V}\right) = \operatorname{erfc} x_0,
\qquad x_0 = \frac{\beta M' - M}{(M + \beta^2 M')^{1/2}}.$$

By solving this equation for $\beta$, one can usually determine an initial approximation
to the decision level. No real solution exists, however, when $M' < x_0^2$. An adequate
starting value in this situation was found to be

$$\beta = M'^{-1}\left(M + M^{1/2} x_0\right).$$

Thereafter one can determine the exact value of the constant $\beta$ by Newton's method,
replacing each trial value of $\beta$ by

$$\beta + \frac{Q_0^{(+)}(\beta) - Q_0}{p(\beta)},$$

where $Q_0^{(+)}(\beta) = \Pr(\beta' > \beta \mid H_0)$ is computed either by Robertson's series (E-10) or
by numerical contour integration as described in the next part; $p(\beta)$ is the probability
density function of $\beta'$ given in (8-6). When $M$ or $M'$ or both are large, Stirling's
approximation

$$n! \approx (2\pi n)^{1/2}\, n^n e^{-n}$$

can be used for the large factorials.
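
The search for the decision level can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the book's program: erfc is taken to be the tail probability of a unit normal variable, as the Gaussian approximation implies; the iteration is the plain Newton update applied to $\Pr(\beta' > \beta \mid H_0) - Q_0 = 0$; the false-alarm probability is computed from a positive-term binomial sum valid for integer $M$ and $M'$ rather than from Robertson's series; and the halving step that keeps $\beta$ positive is an added safeguard. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def false_alarm_probability(beta, M, M_aux):
    """Pr(U/U' > beta | H0) for integer M, M' as a positive binomial sum."""
    n = M + M_aux - 1
    log_x = -math.log1p(beta)
    log_1mx = math.log(beta) - math.log1p(beta)
    return sum(math.exp(math.lgamma(n + 1) - math.lgamma(j + 1)
                        - math.lgamma(n - j + 1) + j * log_x + (n - j) * log_1mx)
               for j in range(M_aux, n + 1))


def ratio_density(beta, M, M_aux):
    """Density of beta' = U/U' under H0, as in (8-6)."""
    log_c = math.lgamma(M + M_aux) - math.lgamma(M) - math.lgamma(M_aux)
    return math.exp(log_c + (M - 1) * math.log(beta)
                    - (M + M_aux) * math.log1p(beta))


def starting_level(q0, M, M_aux):
    """Initial beta from the Gaussian approximation, or the fallback value."""
    x0 = -NormalDist().inv_cdf(q0)          # Pr(X > x0) = q0 for unit normal X
    if M_aux > x0 * x0:
        # Root of x0^2 (M + beta^2 M') = (beta M' - M)^2 with beta M' > M.
        root = math.sqrt(M * M_aux * (M + M_aux - x0 * x0))
        return (M * M_aux + x0 * root) / (M_aux * (M_aux - x0 * x0))
    return (M + math.sqrt(M) * x0) / M_aux


def decision_level(q0, M, M_aux, rel_tol=1e-12, max_iter=200):
    """Newton iteration for the beta that yields false-alarm probability q0."""
    beta = starting_level(q0, M, M_aux)
    for _ in range(max_iter):
        step = ((false_alarm_probability(beta, M, M_aux) - q0)
                / ratio_density(beta, M, M_aux))
        new_beta = beta + step
        if new_beta <= 0.0:                 # safeguard, not part of plain Newton
            new_beta = 0.5 * beta
        if abs(new_beta - beta) <= rel_tol * new_beta:
            return new_beta
        beta = new_beta
    return beta


if __name__ == "__main__":
    b = decision_level(1e-6, 20, 200)
    print("beta =", b, " check Q0 =", false_alarm_probability(b, 20, 200))
```

For $M = 1$ the result can be checked against the closed form $Q_0 = (1 + \beta)^{-M'}$, which follows directly from (8-6).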



8.1.3 The Probability of Detection 



The probability $Q_d$ that the CFAR receiver correctly reports the presence of a signal,

$$Q_d = \Pr(U \ge \beta U' \mid H_1) = \Pr\left(\frac{U}{U'} > \beta \,\Big|\, H_1\right),$$

is given by the complementary cumulative distribution associated with the noncentral
beta distribution or with the related noncentral $F$ distribution. References are to be
found in Appendix E. Recursive methods are described there for computing the
detection probability $Q_d$ for both signals of fixed total energy-to-noise ratio $S$ and
signals with fluctuating amplitudes for which $S$ has a gamma distribution as in
(4-39). When the numbers $M$ and $M'$ are large, the recursive methods can require
lengthy computations liable to round-off error and underflow or overflow.

Because the random variables $U$ and $U'$ are statistically independent, the
moment-generating function of the random variable $V = U - \beta U'$ is

$$h(z) = E(e^{-zV} \mid H_1) = E[e^{-z(U - \beta U')} \mid H_1] = h_U(z)\, h_{U'}(-\beta z),$$

in which $h_U(z)$ and $h_{U'}(z)$ are the moment-generating functions of $U$ and $U'$,
respectively. With $U'$, suitably normalized, governed by the gamma distribution in
(8-5) under both hypotheses,

$$h_{U'}(z) = (1 + z)^{-M'}.$$

When the total energy-to-noise ratio $S$ is fixed, the moment-generating function
of $U$ is given by (4-24),

$$h_U(z) = (1 + z)^{-M} \exp\left(-\frac{Sz}{1 + z}\right), \tag{8-7}$$

and that of the statistic $V$ is therefore

$$h(z) = (1 + z)^{-M} (1 - \beta z)^{-M'} \exp\left(-\frac{Sz}{1 + z}\right). \tag{8-8}$$
If the signal amplitudes are fading, as described in Sec. 4.2.4, the moment-generating
function in (8-7) must be averaged with respect to the distribution of the total
energy-to-noise ratio $S$, as in (4-33). If in particular $S$ has the gamma distribution

$$p(S) = \frac{(S/S')^{k-1} e^{-S/S'}}{(k-1)!\, S'}\, U(S), \qquad S' = \frac{\bar{S}}{k}, \quad k > 0,$$

as in (4-39), we find from (4-41) the moment-generating function

$$h(z) = (1 + z)^{k-M} (1 + bz)^{-k} (1 - \beta z)^{-M'}, \qquad b = 1 + \frac{\bar{S}}{k}, \tag{8-9}$$

for the random variable $V = U - \beta U'$. In any case, the probability density function
of $V$, by the inversion formula for Laplace transforms, is

$$p(V) = \int_{c - i\infty}^{c + i\infty} h(z)\, e^{zV}\, \frac{dz}{2\pi i}; \tag{8-10}$$

for (8-8), $-1 < c < \beta^{-1}$; for (8-9), $-b^{-1} < c < \beta^{-1}$.

By taking $z$ with a negative real part, we can obtain the probability of detection
by integrating (8-10) over $0 < V < \infty$:

$$Q_d = \Pr(V > 0 \mid H_1) = -\int_{c - i\infty}^{c + i\infty} z^{-1} h(z)\, \frac{dz}{2\pi i}, \qquad c < 0.$$

Taking $z$, on the other hand, with a positive real part and integrating over
$-\infty < V < 0$, we find

$$1 - Q_d = \Pr(V < 0 \mid H_1) = \int_{c - i\infty}^{c + i\infty} z^{-1} h(z)\, \frac{dz}{2\pi i}, \qquad c > 0.$$

These integrals can be evaluated by numerical integration on a straight vertical
contour through the saddlepoint of the integrand lying in the interval specified for the
point $z = c$, as described in Sec. 5.2.1. Alternatively, the number of steps in the
numerical integration can be reduced by integrating along a parabola as in Sec. 5.2.2.
For false-alarm probabilities we set $S$ equal to 0 or $b$ equal to 1. The probabilities
of false alarm or detection can be approximated by the saddlepoint method as
described in Sec. 5.2.3. The reader can easily work out the details.
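
A minimal Python sketch of the straight-contour integration for the nonfading case (8-8) is given below. It assumes the detection probability is $-\int z^{-1} h(z)\, dz / 2\pi i$ along $\operatorname{Re} z = c$ with $-1 < c < 0$, locates $c$ by bisection on the derivative of $\ln[h(z)/(-z)]$ along the real axis, and applies the trapezoidal rule in the imaginary direction. The step-size rule (a fixed fraction of the saddlepoint width), the stopping tolerance, and the function name are illustrative choices, not the procedure of Sec. 5.2.1.

```python
import cmath
import math


def detection_probability(S, beta, M, M_aux, rel_tol=1e-15, max_terms=1_000_000):
    """Pr(U > beta U' | H1) from the moment-generating function (8-8).

    Setting S = 0 gives the false-alarm probability.
    """
    def phase(z):                       # ln[h(z) / (-z)]
        return (-M * cmath.log(1 + z) - M_aux * cmath.log(1 - beta * z)
                - S * z / (1 + z) - cmath.log(-z))

    def d_phase(c):                     # first derivative on the real axis
        return (-M / (1 + c) + M_aux * beta / (1 - beta * c)
                - S / (1 + c) ** 2 - 1 / c)

    def d2_phase(c):                    # second derivative, positive on (-1, 0)
        return (M / (1 + c) ** 2 + M_aux * beta ** 2 / (1 - beta * c) ** 2
                + 2 * S / (1 + c) ** 3 + 1 / c ** 2)

    lo, hi = -1.0, 0.0                  # d_phase rises from -inf to +inf here
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if d_phase(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)                 # real saddlepoint of the integrand

    step = 0.2 / math.sqrt(d2_phase(c))
    first = math.exp(phase(complex(c, 0.0)).real)
    total = 0.5 * first                 # trapezoidal rule on y >= 0
    for k in range(1, max_terms):
        term = cmath.exp(phase(complex(c, k * step)))
        total += term.real
        if abs(term) < rel_tol * first:
            break
    return total * step / math.pi


if __name__ == "__main__":
    print("Q0 =", detection_probability(0.0, 1.0, 4, 8))
    print("Qd =", detection_probability(20.0, 1.0, 4, 8))
```

With $S = 0$ and $M = 1$ the result should reproduce $(1 + \beta)^{-M'}$, which gives a quick check on the contour and the sign convention.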

In Fig. 8-1 we plot the average probability $Q_d$ of detecting a signal of fixed
total energy $E_T$ versus $D = \sqrt{2S}$, where $S = E_T/N$ is the energy-to-noise ratio, with
$N$ the unilateral spectral density of the noise actually present. Curves are shown
for various numbers $M'$ of independent observations of the noise alone; $M = 20$.
Even with ten times as many samples known to be free of any signal component,
$M' = 200$, the average probability of detection is still somewhat below the probability
of detecting the signal in white noise whose spectral density is known a priori.

In Fig. 8-2 we have plotted for three values of $M$ the loss in decibels incurred
when the spectral density $N$ of the white noise is unknown and one must resort to
this CFAR receiver. There $S$ is the energy-to-noise ratio the receiver requires to
attain a false-alarm probability $Q_0 = 10^{-6}$ and a detection probability $Q_d = 0.999$
when it supplements $M$ inputs with $M'$ signal-free inputs; $S_0$ is the energy-to-noise
ratio required by the threshold receiver when the noise level $N$ is known. The loss
is plotted versus $R = M'/M$.
Figure 8-1. Probability $Q_d$ of detection in a CFAR receiver versus $D = \sqrt{2S}$,
unfading signals; $Q_0 = 10^{-6}$, $M = 20$. Curves are indexed with the number
$M' = RM$ of signal-free observations.



In practice it may be difficult completely to exclude the signals $s_k(t)$ to be
detected from the auxiliary inputs $v'_j(t)$, and a certain number of the terms $|z'_k|^2$
in (8-4) may, under hypothesis $H_1$, contain signal components. The ones most
likely to be so corrupted will be the largest, and by raising the value of $\beta U'$ they
reduce the probability of correctly choosing $H_1$. It has been proposed, therefore, to
eliminate some fixed number of the largest of the rectified outputs $|z'_k|^2$ from (8-4)
and to base the estimate $U'$ on only the remaining terms. The resulting receiver is
called a censored mean-level detector [Ric77]. The calculation of the false-alarm and
detection probabilities for such a receiver is difficult, for the terms remaining in $U'$
are no longer statistically independent. It has been carried out for Rayleigh-fading
signals by Ritcey, whose paper [Rit86] furnishes references to other work on this
type of detector.
Figure 8-2. Loss $S/S_0$ in decibels versus $R = M'/M$, CFAR receiver; $M$ = number of inputs that
may contain a signal, $M'$ = number of signal-free inputs; $Q_0 = 10^{-6}$, $Q_d = 0.999$.
Curves are indexed with the value of $M$.

8.1.4 Colored Interference of Unknown
Spectral Density

At the beginning of Sec. 8.1.1 we wrote of an enemy jamming one's radar receiver 
by transmitting broadband noise of uncertain strength. The receiver was required 
to estimate the spectral density of the total noise by taking samples of its input in 
spectral regions where no components of the signals are expected to lie. If,
however, the enemy has determined the form of the echo signals, it would be to his 
advantage to confine his noisy jamming signals to the same spectral band as theirs. 

Let us for simplicity consider a single range bin and a single input $v(t)$ to the
receiver. Its total noise will then be

$$n(t) = n_w(t) + n_J(t),$$

where $n_w(t)$ is white noise of known unilateral spectral density $N$, and $n_J(t)$ is colored
noise of narrowband spectral density $\Phi_J(\omega)$ coming from the jammer. Its total
available power

$$P = \int_{-\infty}^{\infty} \Phi_J(\omega)\, \frac{d\omega}{2\pi} \tag{8-11}$$

is assumed known to the receiver. The signal to be detected is as before

$$s(t) = A \operatorname{Re} F(t) \exp(i\Omega t + i\psi), \qquad 0 < t < T,$$

with $\Omega$ the carrier frequency and $\psi$ a phase uniformly distributed over $(0, 2\pi)$. How
should the jammer distribute its total available power $P$ in frequency in order to
reduce as far as possible the probability $Q_d$ with which the receiver detects this signal?
That is, what is the least favorable spectral density $\Phi_J(\omega)$ from the standpoint of the
receiver?

We assume that the signal arrives during an observation interval that is so
much longer than the duration of the signal that it can be taken as $(-\infty, \infty)$. The
probability $Q_d$ of detection as given by (3-75) depends on the signal-to-noise ratio
$d^2$ in (3-53). With the interval $(0, T)$ replaced by $(-\infty, \infty)$, we can go through the
same analysis as led to (2-80) from (2-66) and (2-67). The total narrowband spectral
density is now $\tfrac{1}{2}N + \Phi_J(\omega)$, and we can write the signal-to-noise ratio as

$$d^2 = \int_{-\infty}^{\infty} \frac{|f(\omega)|^2}{N + 2\Phi_J(\omega)}\, \frac{d\omega}{2\pi} \tag{8-12}$$

with $f(\omega)$ the Fourier transform of the complex envelope $F(t)$ of the signal. In the
absence of the interference this reduces to $d_0^2 = 2E/N$, where

$$E = \tfrac{1}{2} \int_{-\infty}^{\infty} |F(t)|^2\, dt = \tfrac{1}{2} \int_{-\infty}^{\infty} |f(\omega)|^2\, \frac{d\omega}{2\pi} \tag{8-13}$$

is the energy of the signal. The least favorable spectral density $\Phi_J(\omega)$ minimizes the
signal-to-noise ratio $d^2$ under the constraints (8-11) and $\Phi_J(\omega) \ge 0$. This problem
has been treated by Zetterberg [Zet62].

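Introducing the Lagrange multiplier $\mu^2$, we combine (8-12) and (8-11) and
minimize

$$\int_{-\infty}^{\infty} \left[ \frac{|f(\omega)|^2}{N + 2\Phi_J(\omega)} + 2\mu^2 \Phi_J(\omega) \right] \frac{d\omega}{2\pi}.$$

At frequencies where the spectral density $\Phi_J(\omega)$ is positive, we can find it by
differentiating the integrand with respect to $\Phi_J(\omega)$,

$$-\frac{2|f(\omega)|^2}{[N + 2\Phi_J(\omega)]^2} + 2\mu^2 = 0,$$

whence

$$2\Phi_J(\omega) = \frac{|f(\omega)|}{\mu} - N > 0. \tag{8-14}$$

This will hold for all frequencies $\omega \in E$ where $\Phi_J(\omega)$ is positive. At all other
frequencies $\omega \in E'$, the spectral density $\Phi_J(\omega)$ must vanish. The value of $\mu$ is determined
by the constraint (8-11):

$$\frac{1}{2\mu} \int_E \left[\, |f(\omega)| - \mu N \,\right] \frac{d\omega}{2\pi} = P. \tag{8-15}$$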

The receiver, presuming that the jammer will utilize its least favorable spectral
density $\Phi_J(\omega)$, passes its input through a narrowband filter whose narrowband
transfer function, as in (2-86), is

$$Y(\omega) = \frac{f^*(\omega)\, e^{-i\omega T'}}{N + 2\Phi_J(\omega)}. \tag{8-16}$$

As before, $T'$ is a delay long enough that $(0, T')$ contains all the signal $s(t)$. The gain
characteristic $|Y(\omega)|$ will thus be uniform in the spectral region $E$; in the
complementary region $E'$ it will be the same as for the matched filter designed for detection
in white noise.

The minimum value of the signal-to-noise ratio that the jammer can enforce is
then, from (8-12),

$$d_{\min}^2 = d_0^2 - \frac{1}{N} \int_E |f(\omega)| \left[\, |f(\omega)| - \mu N \,\right] \frac{d\omega}{2\pi}, \tag{8-17}$$

where $d_0^2$ is the signal-to-noise ratio in the absence of any jamming.

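If the jammer does not use its least favorable spectral density, but puts its
total available power $P$ into some other spectral density $\Phi_1(\omega)$, while the receiver
continues to use the filter prescribed by (8-16), the effective signal-to-noise ratio will
be

$$d_{\mathrm{eff}}^2 = \frac{\left[ \displaystyle\int_{-\infty}^{\infty} \frac{|f(\omega)|^2}{N + 2\Phi_J(\omega)}\, \frac{d\omega}{2\pi} \right]^2}
{\displaystyle\int_{-\infty}^{\infty} \frac{|f(\omega)|^2\, [N + 2\Phi_1(\omega)]}{[N + 2\Phi_J(\omega)]^2}\, \frac{d\omega}{2\pi}}, \tag{8-18}$$

and it is not difficult to show that $d_{\mathrm{eff}}^2 \ge d_{\min}^2$. The receiver is thus assured that the
probability $Q_d$ of detecting the signal, for a preassigned false-alarm probability $Q_0$,
will not be less than

$$Q_d = Q(d_{\min}, b),$$

where $Q(\cdot\,, \cdot)$ is Marcum's Q function (3-76) and $Q(0, b) = Q_0$. For this reason
the matched filter defined by (8-16) might be termed robust.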
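
The construction of the least favorable spectral density can be illustrated numerically. The Python sketch below assumes a power constraint of the form $P = \int \Phi_J(\omega)\, d\omega/2\pi$, a jammer density $2\Phi_J(\omega) = \max(|f(\omega)|/\mu - N,\, 0)$, and a signal spectrum $|f(\omega)| = \exp(-\omega^2/4\Delta\omega^2)$ normalized to unit peak; it finds $\mu$ by bisection on a frequency grid and then compares the resulting signal-to-noise ratio with that obtained when the same filter faces a flat jammer of equal power. The grid, the normalization, and the function names are illustrative assumptions.

```python
import math


def least_favorable_jamming(P, N, delta_omega, half_width=8.0, points=4001):
    """Return (mu, d2_min/d2_0, phi_J, f_abs, weight) on a uniform grid.

    phi_J is the least favorable jammer density; weight is d(omega)/(2 pi).
    """
    w_max = half_width * delta_omega
    dw = 2.0 * w_max / (points - 1)
    weight = dw / (2.0 * math.pi)
    f_abs = [math.exp(-((-w_max + i * dw) / delta_omega) ** 2 / 4.0)
             for i in range(points)]

    def spectrum(mu):                   # positive only where |f| > mu * N
        return [0.5 * max(f / mu - N, 0.0) for f in f_abs]

    lo, hi = 1e-12 / N, 1.0 / N         # jammer power is zero at mu = 1/N
    for _ in range(100):                # bisection on ln(mu); power falls with mu
        mid = math.sqrt(lo * hi)
        if weight * sum(spectrum(mid)) > P:
            lo = mid
        else:
            hi = mid
    mu = math.sqrt(lo * hi)
    phi = spectrum(mu)
    d2_min = weight * sum(f * f / (N + 2.0 * p) for f, p in zip(f_abs, phi))
    d2_0 = weight * sum(f * f for f in f_abs) / N
    return mu, d2_min / d2_0, phi, f_abs, weight


def mismatched_snr(phi_actual, phi_design, f_abs, N, weight):
    """SNR of the filter f*/(N + 2 phi_design) when the jammer uses phi_actual."""
    num = weight * sum(f * f / (N + 2.0 * pd) for f, pd in zip(f_abs, phi_design))
    den = weight * sum(f * f * (N + 2.0 * pa) / (N + 2.0 * pd) ** 2
                       for f, pa, pd in zip(f_abs, phi_actual, phi_design))
    return num * num / den


if __name__ == "__main__":
    P, N = 1.0, 1.0
    mu, ratio, phi, f_abs, weight = least_favorable_jamming(P, N, 1.0)
    flat = [P / (weight * len(phi))] * len(phi)     # same power, spread evenly
    print("mu =", mu, " d2_min/d2_0 (dB) =", 10.0 * math.log10(ratio))
    print("flat jammer / least favorable:",
          mismatched_snr(flat, phi, f_abs, N, weight)
          / mismatched_snr(phi, phi, f_abs, N, weight))
```

The last printed ratio should not fall below 1, in keeping with the statement that no other distribution of the same power lowers the signal-to-noise ratio of this filter further.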

In order to illustrate the effect on signal detectability of a jammer transmitting
random noise with the least favorable spectral density $\Phi_J(\omega)$, we assume that the
signal has a Gaussian form, with spectrum

$$f(\omega) = f(0) \exp\left(-\frac{\omega^2}{4\,\Delta\omega^2}\right),$$

where $\Delta\omega^2$ is the mean-square bandwidth. The region $E$ is now defined by
$-\omega_0 < \omega < \omega_0$, where by (8-14)

$$|f(\omega_0)| = \mu N.$$
In Fig. 8-3 we show as a solid curve the relative signal-to-noise ratio $d_{\min}^2/d_0^2$ in
decibels versus the ratio $P/NW$ in decibels; $W = \Delta\omega/\sqrt{\pi}$ is the bandwidth defined
in (7-11). The dashed curve, referred to the right-hand scale, plots the relative width
$\omega_0/\pi W$ of the least favorable spectral density.

Figure 8-3. Performance of robust matched filter (8-16) for Gaussian signals versus
$P/NW$; $P$ = total jammer power, $N$ = unilateral spectral density of the noise,
$W$ = equivalent bandwidth. Solid curve: relative signal-to-noise ratio
$d_{\min}^2/d_0^2$ (dB). Dashed curve: relative width $\omega_0/\pi W$.

The subject of robust matched filtering has been extended well beyond the range 
of this example. For instance, spectral densities that are unknown, but constrained
to lie between upper and lower limits $\Phi_U(\omega)$ and $\Phi_L(\omega)$ and to carry a prescribed
total power, have been treated; and the possibility that the signal F(t) may be a 
distorted version of a given signal Fo(t) under a constraint such as 



has been taken into account. A broad review of this topic, with numerous references, 
was published by Kassam and Poor [Kas85]. 



8.2 NONPARAMETRIC DETECTION 

8.2.1 Parametric and Nonparametric Hypotheses 

Everything we did in the first seven chapters rested on the postulate that the statistical 
structure of the noise, embodied in its set of probability density functions, is given. 
Indeed, in all examples we took the noise to be of the ubiquitous Gaussian type, with 
expected value zero and a given autocovariance function, the only realistic kind of
noise for which the multivariate probability density functions of arbitrary order can
even be written down. We also assumed that the signals, when present, combine
additively with the noise, and that we know everything about them except perhaps 
the values of a limited number of parameters. When certain statistical properties of 
the noise are unknown, however, the detection problem becomes one of choosing 
between composite hypotheses, and a detection theory is more difficult to establish. 

There are various levels of ignorance to contend with. If everything is known 
about the probability density functions of the noise and of the sum of signal and noise 
except for the values of a finite set of parameters governing them, the two hypotheses 
$H_0$ and $H_1$ are said to be parametric. In detecting signals in white Gaussian noise,
for instance, the only unknown quantity may be the spectral density of the noise, as 
in Sec. 8.1. With colored Gaussian noise, the only undetermined parameters may 
be the variances and the bandwidths of certain components. 

If, on the other hand, the very forms of the distributions of the noise are 
unknown, a finite number of parameters will not suffice to specify them, and the 
hypotheses are called nonparametric. Even Gaussian noise whose autocovariance 
function or spectral density is unknown gives rise to a nonparametric detection 
problem. More generally, it may be possible to describe the noise only qualitatively, 
as by saying that its expected value is 0 or that the probability density function
of its amplitudes is an even function, the forms of its density functions remaining 
otherwise indeterminate. A class of probability density functions, restricted only in 
such a way and wanting values of an infinitude of parameters for their specification, 
is termed a nonparametric class. 

Parametric detection is a matter of choosing between composite hypotheses, 
under which the probability density functions of the data depend on only a finite 
number of unknown parameters. It can be handled by the methods introduced in 
Sec. 3.6. If prior probability density functions of the parameters under the two 
hypotheses are given, the numerator and the denominator of the likelihood ratio can 
be averaged with respect to them, as in (3-91), and the decision is based on the value 
of the resulting ratio. If no prior probability density functions are to be had, least 
favorable ones can be postulated, or the principle of maximum likelihood can be 
applied. When only a finite number of parameters are unknown, their values can be 
estimated separately under each hypothesis, and a likelihood ratio can be determined 
by substituting the estimates into the density functions.

Detection is particularly simple when the unknown parameters of the noise 
can be estimated independently of the presence or absence of a signal. Suppose that 
a coherent communication system transmits pulses of known form that are received 
in white Gaussian noise whose spectral density N is unknown, as might be the case 
if a jammer were trying to interfere by emitting broadband noise. When the pulses 
are orthogonal and are received with equal energies, the optimum receiver does not 
even depend on the value of $N$. When on-off pulses $f(t)$ and 0 are used for sending
binary messages, the output of a filter matched to the pulse $f(t)$ must be compared
with a decision level depending on $N$, but the minimum attainable probability of
error is unaffected by prior ignorance of $N$, for $N$ can in principle be measured
independently. (See Problem 8-1.) In practice $N$ cannot be determined exactly, and
in Sec. 8.1 we described how an estimate of N can be utilized instead. 
When the signal also depends on unknown parameters or when more than
simply the power level of the noise is unknown, the problem of estimating the un- 
known parameters, designing the detection system, and calculating the false-alarm 
and detection probabilities is much more complicated. Each such problem must be 
attacked individually, and little of general applicability can be said. 

8.2.2 Nonparametric Receivers

Besides the ever-present Gaussian thermal noise, input processes that cannot easily 
be described mathematically sometimes perturb a receiver. Randomly occurring im- 
pulses due to lightning, sparks in ignition systems, or faulty connections and switches 
may interfere with communications. A model of such impulse noise based on the 
simplest assumptions provides even first-order probability density functions only at 
the cost of some difficult calculations [Gil60], [Yue78], and to obtain joint probability
density functions of higher order seems hardly possible. Underwater-sound receivers 
pick up sporadic biological noise, strange croakings and cracklings that are difficult 
to describe statistically, yet impede the detection of weak signals. Such extraneous 
noise may not be stationary over a long enough period to allow empirical distribu- 
tions to be measured with any precision. When neither theory nor experiment is 
able to furnish detailed statistics, the designer of a detection system must give up 
characterizing the noise by the usual array of probability density functions and must 
face the problem of how best to choose between nonparametric hypotheses. 

With the distributions of the noise unknown, receivers can no longer be de- 
signed to meet a Bayes criterion. Although it may at times be sensible to admit 
prior probability density functions of a finite number of unknown parameters, it 
can hardly be meaningful to postulate prior distributions of the probability density 
functions $p_0(v)$ and $p_1(v)$ themselves. There is no way to determine the average risk
of a detection strategy with respect to a nonparametric class of distributions and 
hence no way of saying that one detector is better than another in the Bayes sense. 

Of the Neyman-Pearson criterion all that is left is the directive to attain a
specified false-alarm probability, and even this cannot always be achieved. A
receiver whose false-alarm probability is the same for all noise distributions in a given
nonparametric class is said to be a nonparametric, distribution-free, or
constant-false-alarm-rate (CFAR) receiver. When the alternative hypothesis $H_1$ is nonparametric,
there is no way to specify an average probability of detection and hence no way to 
maximize one. 

If no distributions of the noise are available, nothing can be said about the 
correlations among values of the input at different times. How they should best be 
combined becomes uncertain, and matched filtering of the kind we described earlier 
cannot arise from a theory of nonparametric detection. The input must, however, 
be filtered in some way that will favor the signals to be detected by, for instance, 
removing noise of frequencies outside the spectral band that the signals are expected 
to occupy. We shall suppose this to have been done by prefiltering the input. When 
the signals to be detected are narrowband signals of unknown phase, the output of 
the filter may have been rectified as well. 
It is customary in studies of nonparametric detection to assume that this
prefiltered and possibly rectified input is sampled at times far enough apart that the
samples are at least approximately statistically independent. Alternatively, and more
conveniently, one supposes that there are $M$ independent inputs $v_k(t)$, $k = 1, 2, \ldots, M$,
each of which is processed in the manner described in Sec. 4.4. The receiver bases
its decisions on the $M$ data

$$g_j = g[v_j(t)], \qquad j = 1, 2, \ldots, M.$$

These data are statistically independent, and as in Sec. 4.4 they are assumed to be
statistically homogeneous: under each hypothesis all $M$ of the data have identical
probability density functions. It is on the basis of these $M$ data that the receiver is
to decide whether the signal is present or absent.

A restricted approach that has been taken assumes that most of the time the
data $g_1, g_2, \ldots, g_M$ have known, or nominal, probability density functions $f_0(g)$
and $f_1(g)$ under hypotheses $H_0$ and $H_1$, respectively. There are, however, certain
known probabilities $\epsilon_0$ and $\epsilon_1$ that their probability density functions are instead,
say, $h_0(g)$ or $h_1(g)$, so that the overall probability density functions of the data are
actually

$$p_i(g) = (1 - \epsilon_i) f_i(g) + \epsilon_i h_i(g), \qquad i = 0, 1.$$

One seeks the distributions $h_0(g)$ and $h_1(g)$ that are least favorable in the sense that
with the concomitant likelihood-ratio receiver, the Bayes cost is maximum, or in 
the sense that for a preassigned false-alarm probability, the probability of detection 
attained by the Neyman-Pearson receiver is minimum. The decision strategy is then 
said to be robust. The difficult problem of applying this concept to signal detection 
has engendered a considerable literature, which was reviewed by Kassam and Poor 
[Kas85] and by Kazakos and Papantoni-Kazakos [Kaz90, pp. 154-97]. We deem 
that topic too specialized for this book, and we turn instead to a brief survey of 
what has been done when not even nominal density functions for the data can be 
postulated. It is this subject that properly goes under the name of nonparametric 
detection. 

The designer of a nonparametric receiver attempts to exploit the differences, 
often only qualitative, between the characteristics of the input under the two
hypotheses ($H_0$) "signal absent" and ($H_1$) "signal present." The values of the input,
for instance, might be larger on the average, or more often positive, when a signal 
is present than when it is not. The designer seeks a receiver whose probability of 
detecting any of the expected signals is greater than the largest possible value of 
the false-alarm probability; such a receiver is said to be unbiased. Because the data 
are assumed statistically homogeneous, the receiver strategy should be invariant to 
permutations of the samples $g_k$ of the input. The already extensive development of
nonparametric tests of hypotheses, described in books by Fraser [Fra57], Lehmann 
[Leh59], and Kendall and Stuart [Ken61, pp. 465-512], has guided the search for 
receivers with such properties as invariance and absence of bias. Our exposition will 
follow the surveys by Carlyle and Thomas [Car64] and Carlyle [Car68]. A detailed
treatment of this topic is to be found in the book by Gibson and Melsa [Gib75], and 
a bibliography was published by Kassam [Kas80]. 

The performance of each nonparametric receiver will be compared with that
of a standard receiver, which is optimum when the data $g_1, g_2, \ldots, g_M$ are independently
Gaussian random variables. They have expected values zero under hypothesis
$H_0$ that no signal is present. Under hypothesis $H_1$ they have, by our assumption
of statistical homogeneity, a common expected value A. This standard receiver is
therefore one that bases its decisions on the sum

$$G = \sum_{j=1}^{M} g_j \tag{8-19}$$

of the data; we call it the Neyman-Pearson receiver. Each new receiver will be
compared with this one under the condition that both attain the same reliability
$(Q_0, Q_d)$, where $Q_0$ is the false-alarm probability and $Q_d$ the probability of detecting
the same signal. When the data $g_j$ are Gaussian random variables, the false-alarm
and detection probabilities are as usual given by

$$Q_0 = \operatorname{erfc} x, \qquad Q_d = \operatorname{erfc}(x - D\sqrt{M}),$$

where

$$D^2 = \frac{[E(g \mid H_1) - E(g \mid H_0)]^2}{\operatorname{Var}_0 g} = \frac{A^2}{\sigma^2} \tag{8-20}$$

is the effective signal-to-noise ratio for each datum $g_j$ as defined in (4-66); $\sigma^2 = \operatorname{Var}_0 g$,
and $\operatorname{Var}_0 g$ is the variance of each datum under hypothesis $H_0$.

Because of the difficulty of calculating the performance of the Neyman-Pearson 
receiver for many kinds of non-Gaussian noise and the difficulty of calculating that 
of certain nonparametric receivers afflicted by any kind of noise for finite M, the 
analyst must usually resort to assuming $M \gg 1$ and comparing receivers on the
basis of their asymptotic relative efficiency. This concept was introduced in Sec. 4.4 
and should now be reviewed. 



8.2.3 The t-Test

One approach to nonparametric detection is to pretend that the noise has a specific 
statistical structure, such as the Gaussian, to design the receiver on that basis, and 
finally to evaluate its performance for noise different from what was assumed. As an 
illustration we suppose that the signal components of the data $g = (g_1, g_2, \ldots, g_M)$
are equal to some unknown value A. The expected value of the noise components is
taken to be zero, and the signal and noise are assumed to combine additively. Then
the expected values of the samples are

$$E(g_j \mid H_0) = 0, \qquad E(g_j \mid H_1) = A.$$

As the probability density functions of the $g_j$'s are unavailable, their variances are
unknown.

If we pretend that the noise is Gaussian with expected value 0 and unknown
variance $\sigma^2$, we can derive a detection procedure by the method of maximum likelihood
(Sec. 3.6.4). The probability density functions of the samples $g_j$ under the two
hypotheses are taken as

$$p_0(g) = (2\pi\sigma^2)^{-M/2} \exp\left[-\sum_{j=1}^{M} \frac{g_j^2}{2\sigma^2}\right],$$

$$p_1(g) = (2\pi\sigma^2)^{-M/2} \exp\left[-\sum_{j=1}^{M} \frac{(g_j - A)^2}{2\sigma^2}\right].$$

The value of $\sigma^2$ is estimated under hypothesis $H_0$ by maximizing the density function
$p_0(g)$; call this estimate $\hat\sigma_0^2$. Estimates of both A and $\sigma^2$ under $H_1$ are found by
similarly maximizing $p_1(g)$; call the results $\hat A$ and $\hat\sigma_1^2$. When they are substituted into
$p_0(g)$ and $p_1(g)$, respectively, and the likelihood ratio is formed, it is found to be
given simply by $(\hat\sigma_0/\hat\sigma_1)^M$, where

$$\hat\sigma_0^2 = \frac{1}{M}\sum_{j=1}^{M} g_j^2, \qquad \hat\sigma_1^2 = \frac{1}{M}\sum_{j=1}^{M} (g_j - \hat A)^2, \qquad \hat A = \frac{1}{M}\sum_{j=1}^{M} g_j.$$

Comparing this likelihood ratio with a certain decision level is equivalent to comparing
the statistic

$$t = \frac{M^{1/2} \hat A}{s}, \qquad s^2 = \frac{M}{M-1}\,\hat\sigma_1^2 = \frac{1}{M-1}\sum_{j=1}^{M} (g_j - \hat A)^2, \tag{8-21}$$

with some other decision level $t_0$, as one can show with a little algebra. Here $\hat A$ is
the sample mean and s the sample standard deviation of the data. The quantity $s^2$
appearing in (8-21) is an estimate of the variance $\sigma^2$ that happens to be unbiased
under each hypothesis:

$$E(s^2 \mid H_0) = E(s^2 \mid H_1) = \sigma^2.$$

The statistic in (8-21) is known to statisticians as Student's t statistic. If the
components of the signal are expected to be positive, A > 0, hypothesis $H_1$ is chosen
when t exceeds a decision level $t_0$; this is known as the one-sided t-test. If they may
be either positive or negative, one requires the absolute value |t| to surpass another
level $t_0'$; this is the two-sided t-test. If the noise is known to be Gaussian, the level
$t_0$ or $t_0'$ providing a specified false-alarm probability $Q_0$ can be obtained from tables
of Student's t-distribution, which are available in most statistical handbooks. The
probability of detecting a signal by a one-sided t-test can be determined from tables
of the noncentral t distribution [Res57]. The false-alarm and detection probabilities
for the two-sided t-test can be reduced to those for the central and noncentral F
distributions, respectively [Hel85a]. If the noise is not Gaussian or if its first-order
probability density function is unknown, the proper setting of the decision level
cannot be found from the tables; and with the decision level set as for Gaussian
noise, the false-alarm probability may exceed the specified value if the noise actually
has some other distribution.
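
The dependence of the false-alarm probability on the noise distribution can be explored numerically. The following is a minimal sketch, not part of the original analysis: it evaluates the statistic of (8-21) and estimates the one-sided false-alarm rate by simulation. The function names, the centered-exponential noise model, and the choice M = 5 are assumptions made for illustration; the level 2.132 is the tabulated Student's t level for a false-alarm probability of 0.05 with M - 1 = 4 degrees of freedom.

```python
import math
import random

def t_statistic(g):
    """Student's t statistic of (8-21) for a list of samples g."""
    m = len(g)
    a_hat = sum(g) / m                                 # sample mean
    s2 = sum((x - a_hat) ** 2 for x in g) / (m - 1)    # unbiased variance estimate
    return math.sqrt(m) * a_hat / math.sqrt(s2)

def empirical_false_alarm(noise, m, t0, trials, rng):
    """Fraction of noise-only trials in which t exceeds t0 (one-sided test)."""
    hits = 0
    for _ in range(trials):
        if t_statistic([noise(rng) for _ in range(m)]) > t0:
            hits += 1
    return hits / trials

def gaussian_noise(rng):
    return rng.gauss(0.0, 1.0)

def skewed_noise(rng):
    # Hypothetical zero-mean, non-Gaussian noise: a centered exponential variable.
    return rng.expovariate(1.0) - 1.0

T0 = 2.132  # Student's t level for Q0 = 0.05 with 4 degrees of freedom (M = 5)
```

Calling `empirical_false_alarm(gaussian_noise, 5, T0, 20000, random.Random(1))` should return a value near 0.05, whereas the same call with `skewed_noise` generally returns a different rate, which illustrates why the receiver is not nonparametric for finite M.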

When the number M of samples is very large, the distribution of the statistic t
is approximately Gaussian, by virtue of the central limit theorem, for any ordinary
distribution of the noise with expected value zero. The variance of t is approximately
1, and the decision levels $t_0$ and $t_0'$ for a preassigned false-alarm probability are given
by the equations

$$Q_0 \approx \operatorname{erfc} t_0, \qquad Q_0 \approx 2 \operatorname{erfc} t_0'.$$

In this limit $M \gg 1$ the false-alarm probability is the same for all such noise distributions,
and the receiver is said to be asymptotically nonparametric. The probability
of detection can also be estimated in terms of the Gaussian distribution. When
M is very large, the sample standard deviation is nearly equal to the true standard
deviation of the noise, and the expected value of the statistic t becomes

$$E(t \mid H_1) \approx \frac{A\sqrt{M}}{\sigma} = D\sqrt{M}$$

when the sample values of the signal are all equal to A; $D^2 = A^2/\sigma^2$ as in (8-20).
The variance of t is approximately equal to 1 under hypothesis $H_1$ as well. The
detection probabilities are then

$$Q_d \approx \operatorname{erfc}(t_0 - D\sqrt{M}),$$

$$Q_d \approx \operatorname{erfc}(t_0' - D\sqrt{M}) + \operatorname{erfc}(t_0' + D\sqrt{M})$$

for the one- and two-sided t-tests, respectively.
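
The large-M formulas above are easily evaluated. The sketch below is illustrative only and assumes, as (8-29) implies, that erfc x in this chapter denotes the upper-tail probability of a unit Gaussian variable; the function names are ours, and the decision levels are obtained from the inverse Gaussian distribution in the Python standard library rather than from tables.

```python
import math
from statistics import NormalDist

def gauss_tail(x):
    """Upper-tail probability of a unit Gaussian variable (the chapter's erfc x, by assumption)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def asymptotic_t_test(q0, d, m):
    """Large-M detection probabilities of the one- and two-sided t-tests."""
    t0 = -NormalDist().inv_cdf(q0)             # solves Q0 = erfc t0
    t0_prime = -NormalDist().inv_cdf(q0 / 2)   # solves Q0 = 2 erfc t0'
    shift = d * math.sqrt(m)                   # D * sqrt(M)
    qd_one_sided = gauss_tail(t0 - shift)
    qd_two_sided = gauss_tail(t0_prime - shift) + gauss_tail(t0_prime + shift)
    return qd_one_sided, qd_two_sided
```

With D = 0 both expressions reduce to the false-alarm probability, and for D > 0 the one-sided test attains the larger detection probability, as expected when the sign of the signal is known.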

In this limit of a very large number M of samples, the receiver based on the t-test
performs as well as the standard Neyman-Pearson receiver defined in Sec. 8.2.2.
The asymptotic relative efficiency of the t-test receiver with respect to that standard
receiver is equal to 1. It should be borne in mind, however, that the asymptotic
relative efficiency provides a valid basis of comparison only when it is expected
that the receivers will actually utilize a very large number of independent samples
$g_j$. If the number M is small, the receivers may behave quite differently from the
predictions of the asymptotic theory. As we have seen, the t-test receiver is not
nonparametric for finite numbers M of data; its false-alarm probability for fixed
decision level $t_0$ or $t_0'$ depends on the true probability density function of the noise.

8.2.4 The Sign Test 

The example of the t-test receiver shows that a receiver utilizing the unmodified
amplitudes of the samples gj is unlikely ever to be nonparametric for finite M. 
There are two ways to avoid basing the decisions on those amplitudes. One is to 
discard all but the signs of the samples; the other is to arrange the samples in order 
of their amplitudes and use their ranks in this arrangement. We shall first describe 
a receiver that works only with the signs of the samples gj. 

If the noise is as often positive as negative, there will in the absence of a signal 
usually be nearly as many positive samples as negative. If the signal is known to be 
positive, therefore, a preponderance of positive samples will lead one to suspect its 
presence. The receiver can simply count the number $n_+$ of samples that are positive,
choosing hypothesis $H_1$ when $n_+$ exceeds a certain decision level $n_0$. The statistic $n_+$
is invariant to permutations of the data $\{g_j\}$. Such a receiver is said to carry out the
sign test.

The sign-test receiver is nonparametric over the class of probability density
functions of the noise for which the probability of a positive noise value equals $\tfrac12$,
$\Pr(g > 0 \mid H_0) = \tfrac12$. We are in effect assuming that the median of the noise is known
and has been subtracted from all the data. The probability of there being more than
$n_0$ positive samples under hypothesis $H_0$ is given by the binomial distribution,

$$Q_0 = \Pr(n_+ > n_0 \mid H_0) = 2^{-M} \sum_{k=n_0+1}^{M} \binom{M}{k},$$

independently of the true form of the probability density function of the noise. For 
a fixed number M, however, only certain values of the false-alarm probability are 
accessible, and $Q_0$ cannot be less than $2^{-M}$ if signals are to be detected at all.

If one wishes the sign-test receiver to achieve an arbitrary false-alarm probability
$Q_0$, it is necessary to introduce randomization as described in Sec. 1.2.5. If
the number $n_+$ of positive signs exceeds $n_0$, the receiver chooses hypothesis $H_1$. If
$n_+ = n_0$, it chooses $H_1$ with probability f, and the false-alarm probability is

$$Q_0 = 2^{-M} \left[ f \binom{M}{n_0} + \sum_{k=n_0+1}^{M} \binom{M}{k} \right] \tag{8-22}$$

as in (1-57). The procedure described there enables one to determine the values of
$n_0$ and f required for a preassigned value of $Q_0$.

Under our assumption of statistical homogeneity, the probability density function
of each sample $g_j$ under hypothesis $H_1$ will be the same for all; let us denote it
by $p_1(g)$. The probability of detecting the signal is then also given in terms of the
binomial distribution, and as in (1-58),

$$Q_d = f \binom{M}{n_0} p^{n_0} q^{M-n_0} + \sum_{k=n_0+1}^{M} \binom{M}{k} p^k q^{M-k}, \qquad q = 1 - p, \tag{8-23}$$

$$p = \Pr(g > 0 \mid H_1) = \int_0^\infty p_1(g)\, dg.$$

If $p > \tfrac12$ for all possible signals, the sign-test receiver is unbiased.

The simplest way to compare the performance of the sign-test receiver with that 
of the Neyman-Pearson receiver is to determine their asymptotic relative efficiency 
(a.r.e.) for a number of different possible distributions of the noise. We shall show 
that when signal and noise are additive, 

$$p_1(g) = p_0(g - A), \qquad A > 0, \tag{8-24}$$

it is given by

$$\text{a.r.e.} = 4\sigma^2 [p_0(0)]^2, \qquad \sigma^2 = \operatorname{Var}_0 g, \tag{8-25}$$

where $p_0(g)$ is the probability density function of the datum g under hypothesis $H_0$
that no signal is present. Thus when the data are Gaussian random variables,

$$p_0(g) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-g^2/2\sigma^2}, \tag{8-26}$$

a.r.e. = $2/\pi$ = 0.637 [Hod56]. Naturally, the Neyman-Pearson receiver is the better
one for Gaussian noise. When the noise has a bilateral exponential distribution,

$$p_0(g) = \tfrac12 e^{-|g|}, \qquad \sigma^2 = 2, \tag{8-27}$$

a.r.e. = 2. Kanefsky and Thomas [Kan65] calculated the asymptotic relative efficiency
of these receivers for a number of other probability density functions $p_0(g)$.
They found that if the density function of the noise is either highly asymmetrical
or very much more peaked than the Gaussian, as seems to be the case with certain 
kinds of impulse noise, the asymptotic relative efficiency exceeds 1 and the sign-test 
receiver is superior to the Neyman-Pearson receiver, which bases its decisions on 
the sum G of the data as in (8-19). 
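
As a quick numerical check of (8-25), the following sketch evaluates the asymptotic relative efficiency for the two densities (8-26) and (8-27) and converts it to decibels. The function name is ours; only the values of $p_0(0)$ and of the variance stated in the text are used.

```python
import math

def sign_test_are(p0_at_zero, variance):
    """Asymptotic relative efficiency (8-25): 4 * sigma^2 * [p0(0)]^2."""
    return 4.0 * variance * p0_at_zero ** 2

# Unit-variance Gaussian density (8-26): p0(0) = 1/sqrt(2*pi)
gaussian_are = sign_test_are(1.0 / math.sqrt(2.0 * math.pi), 1.0)
# Bilateral exponential density (8-27): p0(0) = 1/2, variance 2
laplace_are = sign_test_are(0.5, 2.0)

gaussian_db = 10.0 * math.log10(gaussian_are)
laplace_db = 10.0 * math.log10(laplace_are)
```

The two efficiencies come out as $2/\pi$ and 2, that is, about -1.96 dB and +3.01 dB.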

The asymptotic relative efficiency can be determined by (4-67),

$$\text{a.r.e.} = \lim_{A \to 0} \frac{D_2^2}{D_1^2}, \tag{8-28}$$

where $D_1^2$ and $D_2^2$ are the effective signal-to-noise ratios of the two receivers when the
input signal strength equals A. As in (8-20), $D_1^2 = A^2/\sigma^2$ for the Neyman-Pearson
receiver.

The sign-test receiver is based on the statistic

$$n_+ = \sum_{j=1}^{M} U(g_j),$$

where $U(\cdot)$ is the unit step function, for which

$$E[U(g) \mid H_1] = \Pr(g > 0 \mid H_1) = p = \int_0^\infty p_1(g)\, dg.$$

When we assume as in (8-24) that signal and noise are additive,

$$p = \int_0^\infty p_0(g - A)\, dg = \int_{-A}^{\infty} p_0(x)\, dx = \tfrac12 + \int_{-A}^{0} p_0(x)\, dx \approx \tfrac12 + A\, p_0(0),$$

provided that $p_0(0) \neq 0$ and A is small. Now

$$\operatorname{Var}_0 U(g) = E\{[U(g)]^2 \mid H_0\} - \{E[U(g) \mid H_0]\}^2 = \tfrac12 - \tfrac14 = \tfrac14,$$

and the effective signal-to-noise ratio of the datum U(g) is

$$D_2^2 = 4A^2 [p_0(0)]^2$$

by the definition in (8-20). Substituting into (8-28), we obtain the asymptotic relative
efficiency as given in (8-25).

Unfortunately the asymptotic relative efficiency is an unreliable measure of
the performance of the sign-test receiver unless the number M of samples is very
large. In Figs. 8-4 and 8-5 we plot in decibels the ratio of the input energy-to-noise
ratio required by the Neyman-Pearson receiver ($S_1$) to that required by the sign-test
receiver ($S_2$) in order to achieve the same reliability $(Q_0, Q_d)$. We set $Q_0 = 10^{-3}$ and
$10^{-6}$ and $Q_d = 0.99$. For Fig. 8-4 it was assumed that the noise is Gaussian, and as
$M \to \infty$ the ratio plotted approaches $10 \log_{10}(2/\pi) = -1.9612$, albeit slowly. Figure
8-5 displays the same ratio for noise with the bilateral exponential distribution of
(8-27), and as $M \to \infty$ it approaches $10 \log_{10} 2 = 3.0103$. Again the approach is
very slow. A calculation by the saddlepoint approximation showed that at $M = 10^5$
the relative efficiency is still only 2.91 dB for $Q_0 = 10^{-6}$, $Q_d = 0.99$.

Figure 8-4. Ratio in decibels of energy-to-noise ratios to attain reliability $(Q_0, Q_d)$
by sign-test receiver and Neyman-Pearson receiver; Gaussian noise. $Q_d = 0.99$;
curves are indexed with the value of $Q_0$.

For numbers M of data less than about 100, however, the ratio is rather less 
than the limiting value. For noise with the bilateral exponential distribution, Fig. 8-5 
shows that the Neyman-Pearson receiver is even superior to the sign-test receiver 
when M < 60 for $Q_0 = 10^{-3}$ and when M < 120 for $Q_0 = 10^{-6}$, a conclusion opposite
to that indicated by the asymptotic relative efficiency. Comparing receivers on
the basis of their asymptotic relative efficiency is unreliable unless the number M of 
terms summed is very large. 

These curves were computed in the following way. For a given number M
of samples and a given false-alarm probability $Q_0$, we first determined $n_0$ and f
from (8-22) and then by Newton's method solved (8-23) for the probability p. From
the value of p we determined the input signal amplitude $A_2$ required to attain the
detection probability $Q_d$ with the sign-test receiver. For Gaussian noise of unit
variance,

$$p = 1 - \operatorname{erfc} A_2, \tag{8-29}$$

and for the bilateral exponential distribution, by (8-27),

$$p = 1 - \tfrac12 \exp(-A_2). \tag{8-30}$$

Figure 8-5. Ratio in decibels of energy-to-noise ratios to attain reliability $(Q_0, Q_d)$
by sign-test receiver and Neyman-Pearson receiver; bilateral exponential noise distribution.
$Q_d = 0.99$; curves are indexed with the value of $Q_0$.

With Gaussian noise the statistic G in (8-19) has expected value $MA_1$ and
variance $M\sigma^2 = M$ under hypothesis $H_1$, with $A_1$ determined by

$$Q_0 = \operatorname{erfc} x, \qquad Q_d = \operatorname{erfc}(-y) = \operatorname{erfc}(x - A_1\sqrt{M}),$$

in which x and y were found from tables of the error-function integral. Then

$$A_1 = \frac{x + y}{\sqrt{M}}.$$

The ratio of the required input energy-to-noise ratios is $A_1^2/A_2^2$, which is plotted in
decibels in Fig. 8-4.
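
The procedure just described for Gaussian noise can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the program used for the figures: bisection replaces Newton's method for solving (8-23), the inverse Gaussian distribution of the Python standard library replaces the tables, erfc is taken to be the upper-tail probability of a unit Gaussian variable, and all function names are ours. It is intended for moderate M, for which the binomial terms can be summed directly.

```python
import math
from fractions import Fraction
from statistics import NormalDist

def sign_test_threshold(m, q0):
    """Decision level n0 and randomization probability f satisfying (8-22)."""
    target = Fraction(q0) * 2 ** m          # Q0 * 2^M, kept exact
    tail = 0                                # number of sign patterns with n+ > n0
    for n0 in range(m, -1, -1):
        c = math.comb(m, n0)
        if tail + c >= target:
            return n0, float((target - tail) / c)
        tail += c
    raise ValueError("q0 must lie in (0, 1]")

def sign_test_qd(m, n0, f, p):
    """Detection probability (8-23) when each sample is positive with probability p."""
    q = 1.0 - p
    def term(k):
        return math.comb(m, k) * p ** k * q ** (m - k)
    return f * term(n0) + sum(term(k) for k in range(n0 + 1, m + 1))

def ratio_db_gaussian(m, q0, qd):
    """10 log10(A1^2 / A2^2) for unit-variance Gaussian noise."""
    n0, f = sign_test_threshold(m, q0)
    if sign_test_qd(m, n0, f, 1.0) < qd:
        raise ValueError("qd is not attainable with this m and q0")
    lo, hi = 0.5, 1.0
    for _ in range(60):                     # bisection on p; Qd increases with p
        mid = 0.5 * (lo + hi)
        if sign_test_qd(m, n0, f, mid) < qd:
            lo = mid
        else:
            hi = mid
    p = 0.5 * (lo + hi)
    a2 = NormalDist().inv_cdf(p)            # inverts (8-29): p = 1 - erfc A2
    x = -NormalDist().inv_cdf(q0)           # Q0 = erfc x
    y = NormalDist().inv_cdf(qd)            # Qd = erfc(-y)
    a1 = (x + y) / math.sqrt(m)
    return 10.0 * math.log10(a1 ** 2 / a2 ** 2)
```

For example, `ratio_db_gaussian(100, 1e-3, 0.99)` gives a ratio somewhat below the limiting value of -1.96 dB, in keeping with the behavior described for moderate M.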

When the noise distribution is bilateral exponential, it is necessary to calculate
the density function $p_0(G)$ of the statistic G in (8-19) by in effect convolving the
density function in (8-27) M - 1 times. The moment-generating function of the
distribution of G is the bilateral Laplace transform

$$h(z) = E(e^{-Gz} \mid H_0) = (1 - z^2)^{-M},$$

and the density function $p_0(G)$ was found by evaluating the contour integral

$$p_0(G) = \int_{c - i\infty}^{c + i\infty} (1 - z^2)^{-M} e^{Gz}\, \frac{dz}{2\pi i}, \qquad -1 < c < 1,$$

by the residue theorem. The result was then integrated to determine

$$Q_0 = \int_{G_0}^{\infty} p_0(G)\, dG; \tag{8-31}$$

see also [Joh70, Ch. 23]. Furthermore, the probability of detection is

$$Q_d = \int_{G_0}^{\infty} p_0(G - MA_1)\, dG = \int_{G_1}^{\infty} p_0(G)\, dG, \qquad G_1 = G_0 - MA_1. \tag{8-32}$$

The values of $G_0$ and $G_1$ were obtained by solving (8-31) and (8-32) by Newton's
method, and from them the input amplitude $A_1$ was determined from

$$A_1 = \frac{G_0 - G_1}{M}.$$

The ratio $A_1^2/A_2^2$ is plotted in decibels in Fig. 8-5.
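
The tail probability in (8-31) can also be obtained numerically. The sketch below does not follow the residue evaluation described above; instead it assumes that the integration over G is carried out under the integral sign, which for Re z < 0 gives $Q_0 = -\int (1 - z^2)^{-M} e^{G_0 z} z^{-1}\, dz/2\pi i$ along a vertical line -1 < c < 0, and it applies the trapezoidal rule along that line. The function name, the choice c = -1/2, the step, and the truncation point are assumptions.

```python
import cmath
import math

def laplace_sum_tail(g0, m, c=-0.5, y_max=200.0, dy=0.005):
    """Pr(G > g0 | H0) for the sum G of m bilateral-exponential variables (8-27).

    Trapezoidal integration of -(1 - z^2)^(-m) exp(g0 z) / z along Re z = c.
    The real part of the integrand is even in y = Im z, so only y >= 0 is summed.
    """
    def integrand(y):
        z = complex(c, y)
        return ((1.0 - z * z) ** (-m) * cmath.exp(g0 * z) / z).real

    steps = int(round(y_max / dy))
    total = 0.5 * integrand(0.0)
    for i in range(1, steps + 1):
        total += integrand(i * dy)
    return -total * dy / math.pi
```

For M = 2 the density of G is $(1 + |G|) e^{-|G|}/4$, so that the tail probability is $(2 + G_0) e^{-G_0}/4$ for $G_0 \ge 0$; `laplace_sum_tail(3.0, 2)` can be compared with that closed form as a check.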

This method was utilized only for M < 125, and the jagged curves shown in
the figures appeared, the sawteeth resulting from the jumps in the integer decision
level $n_0$ as M increases. For M > 125 the serrations of the exact curves became too
fine to plot, and the numbers of terms that needed to be summed when computing
(8-32) became so large that round-off error began to introduce inaccuracy. The
calculation of the probability p was instead based on a contour integral for the
cumulative binomial distribution in terms of its probability generating function

$$h(z) = (pz + q)^M, \qquad q = 1 - p,$$

with which we find as in (5-45)

$$q_n^{(+)}(p) = \Pr(k \ge n \mid p) = \int_{C_+} \frac{z^{-n} h(z)}{z - 1}\, \frac{dz}{2\pi i}, \tag{8-33}$$

where k is a binomial random variable; the contour $C_+$ is a circle centered at the
origin and enclosing the point z = 1. The integral in (8-33) can be regarded as
providing the quantity $q_n^{(+)}(p)$ as a function of a continuous variable n, of which
the false-alarm and detection probabilities in (8-22) and (8-23) are polygonal approximations
formed by connecting the values of $q_n^{(+)}(p)$ at integral values of n by
straight lines. When $M \gg 1$, the polygon lies close to the smooth curve, and we do
not make much error if we solve (8-33) with $p = q = \tfrac12$ for the value of n making
$q_n^{(+)}(\tfrac12) = Q_0$, and if we then use that value of n, usually nonintegral, to determine
the value of p for which $q_n^{(+)}(p) = Q_d$. The latter computation was carried out by the
secant method. From the value of p the signal amplitude $A_2$ was again determined
as in (8-30). For bilateral-exponential noise the cumulative distribution of the statistic
G needed to determine $Q_d(A_1)$ was computed by the contour-integration method
of Sec. 5.2 [Hel89a]. The ratios $A_1^2/A_2^2$ in decibels plotted in Figs. 8-4 and 8-5 then
became smooth curves, which can be considered as approximations to the too finely
serrated exact curves for M > 125.
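
For integral n the contour integral (8-33) can be verified directly. The following sketch is illustrative only: it integrates around a circle of radius 1.5 by the midpoint rule, and the function name, radius, and number of points are assumptions. It makes no claim about the nonintegral values of n used for the smooth curves, for which the branch of $z^{-n}$ and the contour must be chosen with more care.

```python
import cmath
import math

def binomial_tail_contour(n, m, p, radius=1.5, points=1024):
    """Pr(k >= n) for a binomial (m, p) variable via the contour integral (8-33).

    The contour is the circle |z| = radius > 1; with z = radius * exp(i*theta),
    dz / (2*pi*i) equals z * dtheta / (2*pi).
    """
    q = 1.0 - p
    total = 0.0
    for j in range(points):
        theta = -math.pi + (j + 0.5) * 2.0 * math.pi / points
        z = cmath.rect(radius, theta)
        value = cmath.exp(-n * cmath.log(z)) * (p * z + q) ** m / (z - 1.0)
        total += (value * z).real
    return total / points

def binomial_tail_sum(n, m, p):
    """The same probability by direct summation, for comparison."""
    return sum(math.comb(m, k) * p ** k * (1.0 - p) ** (m - k) for k in range(n, m + 1))
```

For instance, `binomial_tail_contour(4, 10, 0.3)` and `binomial_tail_sum(4, 10, 0.3)` should agree to many decimal places.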

From this description it is seen how one can calculate similar performance 
curves for any kind of noise distribution, provided that one can work out the cu- 
mulative distribution of the statistic G in (8-19). Contour-integration methods of 
the type introduced in Sec. 5.2 will generally be applicable if one can calculate the 
moment-generating function of the datum gj under the postulated distribution. 

8.2.5 Rank Tests

If one knows more about the noise than what was assumed in Sec. 8.2.4, one can 
expect to detect the signal more efficiently. The class of even noise density functions 

$$p_0(g) = p_0(-g) \tag{8-34}$$

is included in the class for which $\Pr(g > 0 \mid H_0) = \tfrac12$, but is smaller. For this restricted
class of noise distributions the signed rank test is nonparametric and is
superior to the sign test. 

When (8-34) holds and when the signal components of the data gj are positive, 
not only are there likely to be more positive samples gj than negative when a signal 
is present, but the sample values of large absolute value are more likely to be positive 
than negative. This observation suggests a more elaborate and more efficient test 
based on ranking the data {gj} in accordance with their absolute values. Again we 
assume statistical independence and homogeneity of the M data. The signal adds a 
constant amplitude A to the data, and under hypothesis H\ the probability density 
function of each datum $g_j$ has the form in (8-24). The test treats all samples alike
and is invariant to their permutation. 

Once all M samples have been received, they are arranged according to their 
absolute values, 

$$|g_{i_1}| < |g_{i_2}| < \cdots < |g_{i_M}|,$$

where $i_1, i_2, \ldots, i_M$ is a permutation of all the integers from 1 to M. The kth sample
in this arrangement is assigned a "rank" equal to k. The receiver adds the ranks of
those samples that are positive, forming what is known as the Wilcoxon signed rank
statistic

$$r = \sum_{k=1}^{M} k\, U(g_{i_k}), \tag{8-35}$$

where again $U(\cdot)$ is the unit step function. If the statistic r exceeds a decision level
$r_0$, hypothesis $H_1$ is chosen. The statistic can also be written as

$$r = \sum_{i=1}^{M} \sum_{j=i}^{M} U(g_i + g_j) = \sum_{j=1}^{M} \sum_{i=1}^{j} U(g_i + g_j), \tag{8-36}$$

which is more easily implemented in an electronic receiver [Car64].

We pause to demonstrate the identity of (8-35) and the rightmost double sum
in (8-36). Because both these are invariant to permutations of the data, we can
assume that the data are already ranked:

$$|g_1| < |g_2| < \cdots < |g_M|. \tag{8-37}$$

Now consider the jth term in the right-hand sum in (8-36):

$$t_j = \sum_{i=1}^{j} U(g_i + g_j).$$

If $g_j$ is positive, $g_i + g_j$ is positive because $|g_i| \le |g_j|$, and $U(g_i + g_j) = 1$ for all $i \le j$,
whereupon $t_j = j$. If $g_j < 0$, on the other hand, $g_i + g_j < 0$ for all $i \le j$, and $t_j = 0$.
Therefore,

$$t_j = j\, U(g_j),$$

which is the jth term of (8-35) when the data are arranged as in (8-37). Thus both 
forms of the statistic r yield the same value. 
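
The identity just demonstrated is easily checked numerically. In this minimal sketch the function names are ours, and the samples are assumed to have distinct, nonzero absolute values so that the ranking is unambiguous.

```python
def unit_step(x):
    """Unit step function U(x)."""
    return 1 if x > 0 else 0

def signed_rank_by_sorting(g):
    """Wilcoxon signed rank statistic in the form (8-35)."""
    ordered = sorted(g, key=abs)                      # rank by absolute value
    return sum(k * unit_step(x) for k, x in enumerate(ordered, start=1))

def signed_rank_by_pairs(g):
    """The same statistic as the double sum in (8-36); no sorting is required."""
    m = len(g)
    return sum(unit_step(g[i] + g[j]) for j in range(m) for i in range(j + 1))

samples = [0.3, -1.2, 2.0, -0.1, 0.7]
r_sorted = signed_rank_by_sorting(samples)   # ranks 2, 3, and 5 are positive
r_pairs = signed_rank_by_pairs(samples)
```

Both forms give r = 10 for these samples. The pairwise form counts the pairs (i, j), i ≤ j, with positive sum, which is why it needs no ranking of the data.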

Because the probability density function of the noise is an even function, the
probability under hypothesis $H_0$ that a given datum $g_j$ is positive and its rank will
appear in the sum r is equal to $\tfrac12$. The expected value of the rank sum r is then

$$E(r \mid H_0) = \tfrac12 \sum_{k=1}^{M} k = \tfrac14 M(M+1). \tag{8-38}$$

The false-alarm probability is obtained by examining all $2^M$ partitions of the
integers from 1 to M into a primary set with m members ($0 \le m \le M$) and its
complementary set with M - m members. Under hypothesis $H_0$ all primary sets are
equally likely to appear as terms in the rank sum (8-35). The false-alarm probability
is then $2^{-M}$ times the number of primary sets whose sum exceeds the decision level
$r_0$.

Again, if an arbitrary false-alarm probability is desired, randomization must
be introduced. The receiver decides that a signal is present whenever $r > r_0$; when
$r = r_0$, it chooses hypothesis $H_1$ with probability f. The false-alarm probability is
then

$$Q_0 = 2^{-M} \left[ f\, h(r_0; M) + \sum_{r=r_0+1}^{R} h(r; M) \right] = 2^{-M} \left[ f\, h(R - r_0; M) + \sum_{r=0}^{R-r_0-1} h(r; M) \right],$$

where h(r; M) is the number of primary sets of integers whose sum equals r, and
R = M(M + 1)/2 is the maximum possible rank sum. We have used the symmetry
h(r; M) = h(R - r; M). The false-alarm probability is independent of the actual
probability density function of the samples under hypothesis $H_0$, provided only that
it is an even function, and for this class of noise distributions the rank-test receiver
is nonparametric.

The numbers h(r; M) obey the recurrent relations

$$h(k; M) = h(k; M-1), \qquad 0 \le k < M,$$

$$h(k; M) = h(k; M-1) + h(k - M; M-1), \qquad M \le k \le R,$$

$$h(0; 1) = h(1; 1) = 1,$$

in which h(k; M - 1) is taken as zero when k exceeds the maximum rank sum for
M - 1 samples. These relations are easily programmed. Tables facilitating the
computation of $Q_0$ and, inversely, the determination of the decision level $r_0$ are to
be found in [Bic77, pp. 479-81] and [Wil73]. For $M \gg 1$ the tables are very lengthy,
and the computation of $Q_0$ takes a long time and requires the storage of many
intermediate numbers.

The probabilities $2^{-M} h(k; M)$ possess the probability-generating function

$$h(z) = 2^{-M} \sum_{k=0}^{R} h(k; M)\, z^k = 2^{-M} \prod_{p=1}^{M} (1 + z^p), \tag{8-39}$$

and the cumulative probabilities

$$q_k^{(-)} = \Pr(r < k \mid H_0) = 2^{-M} \sum_{j=0}^{k-1} h(j; M)$$

can instead be computed by saddlepoint integration (5-44). The "phase" of that
integrand is

$$\Phi(z) = \sum_{p=1}^{M} \ln(1 + z^p) - k \ln z - \ln(1 - z) - M \ln 2.$$

There exists a unique saddlepoint $z_0$ in 0 < Re z < 1, which is the root of

$$\Phi'(z) = \sum_{p=1}^{M} \frac{p z^{p-1}}{1 + z^p} - \frac{k}{z} - \frac{1}{z - 1} = 0. \tag{8-40}$$

This can quickly be solved by Newton's method. The curve $C_-$ in (5-44) is conveniently
taken as the straight vertical chord of the unit circle passing through the
saddlepoint $z_0$, combined with the portion C of the unit circle to the left of the
chord. When M is on the order of 30 or more, the contribution of the chord dominates.
On C the integrand has many zeros, and that part of the path of integration
contributes negligibly to the cumulative probability $q_k^{(-)}$. The integration along the chord can be carried out by
the trapezoidal rule as described in Sec. 5.2.2, and when M > 30, the stopping rule 
given there cuts off the numerical integration before the unit circle is reached. The 
results of such a numerical saddlepoint integration have been found to have high 
relative accuracy, and in contrast to the recurrent method, computation time and 
storage requirements are nearly independent of M and k. 
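
For comparison with the saddlepoint method, the recurrent method for the numbers h(k; M) can be sketched as follows. The function names are ours, and the sketch simply counts the primary sets and sums them as in the randomized test described earlier; it is practical only for moderate M, since the list of counts has M(M + 1)/2 + 1 entries.

```python
def rank_sum_counts(m):
    """List h with h[k] = number of subsets of {1, ..., m} whose elements sum to k."""
    h = [1, 1]                                   # h(0; 1) = h(1; 1) = 1
    for size in range(2, m + 1):
        r_max = size * (size + 1) // 2
        prev = h + [0] * (r_max + 1 - len(h))    # h(k; size - 1), zero beyond its range
        h = [prev[k] + (prev[k - size] if k >= size else 0) for k in range(r_max + 1)]
    return h

def signed_rank_false_alarm(m, r0, f=0.0):
    """False-alarm probability of the randomized signed-rank test with level r0."""
    h = rank_sum_counts(m)
    return (f * h[r0] + sum(h[r0 + 1:])) / 2 ** m
```

For M = 3 the counts are 1, 1, 1, 2, 1, 1, 1 for k = 0, ..., 6, and `signed_rank_false_alarm(3, 4)` equals 2/8 = 0.25.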

As can be imagined, calculating the probability of detection attained by this 
receiver is extremely difficult, and one resorts to the asymptotic relative efficiency 
as a criterion for comparing it with other receivers. It is shown in Appendix G 
that relative to the Neyman-Pearson receiver the asymptotic relative efficiency of 
the rank-test receiver is 

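$$\text{a.r.e.} = 12\sigma^2 \left[ \int_{-\infty}^{\infty} [p_0(g)]^2\, dg \right]^2, \qquad \sigma^2 = \operatorname{Var}_0 g, \tag{8-41}$$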

when the signal and noise are additive as in (8-24). 

For Gaussian noise the asymptotic relative efficiency equals $3/\pi = 0.955$; for
the bilateral exponential distribution in (8-27) it equals 3/2. If one minimizes

$$\int_{-\infty}^{\infty} [p_0(g)]^2\, dg$$

under the constraints

$$\int_{-\infty}^{\infty} p_0(g)\, dg = 1, \qquad \int_{-\infty}^{\infty} g\, p_0(g)\, dg = 0, \qquad \int_{-\infty}^{\infty} g^2 p_0(g)\, dg = \sigma^2,$$

one finds, within an arbitrary scaling factor, that the probability density function
for which the asymptotic relative efficiency is minimum has the form

$$p_0(g) = \begin{cases} \tfrac34 (1 - g^2), & |g| \le 1, \\ 0, & |g| > 1, \end{cases}$$

and the attendant asymptotic relative efficiency equals 108/125 = 0.864.
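
The three values quoted for the asymptotic relative efficiency of the rank test can be confirmed by numerical integration. The sketch below assumes the form 12 times the variance times the square of the integral of $[p_0(g)]^2$ for (8-41), which reproduces the quoted values; the function name, the midpoint rule, and the truncated integration ranges are our choices.

```python
import math

def rank_test_are(density, lo, hi, steps=100000):
    """12 * Var(g) * (integral of density^2)^2, by the midpoint rule on [lo, hi]."""
    dg = (hi - lo) / steps
    mass = first = second = squared = 0.0
    for i in range(steps):
        g = lo + (i + 0.5) * dg
        p = density(g)
        mass += p * dg
        first += g * p * dg
        second += g * g * p * dg
        squared += p * p * dg
    variance = second / mass - (first / mass) ** 2
    return 12.0 * variance * (squared / mass ** 2) ** 2

def gaussian(g):
    return math.exp(-0.5 * g * g) / math.sqrt(2.0 * math.pi)

def bilateral_exponential(g):
    return 0.5 * math.exp(-abs(g))

def parabolic(g):
    return 0.75 * (1.0 - g * g)

are_gaussian = rank_test_are(gaussian, -12.0, 12.0)
are_exponential = rank_test_are(bilateral_exponential, -40.0, 40.0)
are_parabolic = rank_test_are(parabolic, -1.0, 1.0)
```

The results are close to $3/\pi$, 3/2, and 108/125, respectively.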

If the signal may be either positive or negative, but maintains the same sign
throughout, as it must under our assumption of statistical homogeneity, the receiver
can determine the rank sum r of the positive samples and the rank sum
$r' = \tfrac12 M(M+1) - r$ of the negative samples. If the larger of these exceeds a certain
decision level $r_0'$, the receiver asserts that a signal is present. With a positive signal
the statistic r tends to be large; with a negative signal the statistic r' tends to be
large. Tests such as these are known as Wilcoxon tests [Wil45].

The false-alarm probability for this receiver equals $2^{-M}$ times twice the number
of primary sets whose sum exceeds $r_0'$ when no randomization is involved. A
randomized test chooses hypothesis $H_1$ when $\max(r, r') > r_0'$; when $\max(r, r') = r_0'$
it chooses $H_1$ with probability f. The false-alarm probability is then

$$Q_0 = 2^{-(M-1)} \left[ f\, h(r_0'; M) + \sum_{r=r_0'+1}^{R} h(r; M) \right], \qquad r_0' > \tfrac12 R, \quad R = \tfrac12 M(M+1).$$

This Wilcoxon test is also nonparametric: its false-alarm probability is independent
of the actual density function of the noise.

A radar receiver may be required to detect a train of coherent narrowband
pulses having a common phase $\psi$ in each input, and as usual that phase is unknown
and may lie anywhere in $(0, 2\pi)$. The input is passed through a narrowband
filter matched to the signal $\mathrm{Re}[F(t) \exp i\Omega t]$, and at an appropriate time the complex
envelope of its output is sampled to yield, for the jth interpulse interval, a
complex sample $v_{cj} + i v_{sj}$, $1 \le j \le M$. The receiver can then rank the sequences
$(v_{c1}, v_{c2}, \ldots, v_{cM})$ and $(v_{s1}, v_{s2}, \ldots, v_{sM})$ individually by their absolute values, so
that

$$|v_{c i_1}| < |v_{c i_2}| < \cdots < |v_{c i_M}|, \qquad |v_{s j_1}| < |v_{s j_2}| < \cdots < |v_{s j_M}|,$$

where $(i_1, i_2, \ldots, i_M)$ and $(j_1, j_2, \ldots, j_M)$ are permutations of the integers from 1 to
M. The rank sums

$$r_c = \sum_{k=1}^{M} k\, U(v_{c i_k}), \qquad r_c' = \tfrac12 M(M+1) - r_c,$$

$$r_s = \sum_{k=1}^{M} k\, U(v_{s j_k}), \qquad r_s' = \tfrac12 M(M+1) - r_s,$$

are formed. The receiver then compares the statistic $(r_c'')^2 + (r_s'')^2$ with a decision level
$r_0$, where $r_c'' = \max(r_c, r_c')$, $r_s'' = \max(r_s, r_s')$; if the level is exceeded, a signal is
declared present.

If no signal is present, the rank sums will all be roughly equal to their expected
value $\tfrac14 M(M+1)$; but if a signal with some arbitrary phase $\psi$ is present, one or the
other or both outputs of the sampler will have a preponderance of positive or negative
values, and the statistic just defined will, most likely, be larger. If the phases of the
successive signals are variable and independently random, however, the test will fail.
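
The two-channel statistic described above for narrowband pulses of unknown phase can be sketched as follows. The function names are ours, and the in-phase and quadrature samples are assumed to be supplied as two lists of equal length with distinct, nonzero absolute values.

```python
def signed_rank_sum(samples):
    """Sum of the ranks (by absolute value) of the positive samples."""
    ordered = sorted(samples, key=abs)
    return sum(k for k, v in enumerate(ordered, start=1) if v > 0)

def two_channel_statistic(v_c, v_s):
    """Sum of squares of the larger rank sum in each quadrature channel."""
    m = len(v_c)
    total = m * (m + 1) // 2                  # r + r' for each channel
    r_c = signed_rank_sum(v_c)
    r_s = signed_rank_sum(v_s)
    return max(r_c, total - r_c) ** 2 + max(r_s, total - r_s) ** 2

example = two_channel_statistic([1.0, 2.0, 3.0], [-1.0, 2.0, -3.0])
```

Here the first channel gives max(6, 0) = 6 and the second max(2, 4) = 4, so the statistic is 52. Reversing the sign of every sample in either channel leaves the statistic unchanged, which is what allows the unknown phase to fall in any quadrant.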

8.2.6 Receivers with a Reference Input



Detection is facilitated if the receiver can obtain a separate set of inputs exhibiting 
the same type of noise as what corrupts the signals, but themselves free of any 
signals, much as in the CFAR receiver treated in Sec. 8.1. Let us suppose that we 
have L such reference inputs that provide independent samples $h_1, h_2, \ldots, h_L$ that
contain only noise, besides the M independent samples $g_1, g_2, \ldots, g_M$ that may or
may not contain a signal.

Adapting from statistics what is variously known as the Wilcoxon two-sample
test [Wil45] and the Mann-Whitney test [Man47], Capon [Cap59] analyzed a receiver
that detects a constant signal causing the samples $g_1, g_2, \ldots, g_M$ to be generally larger
than the reference samples $h_1, h_2, \ldots, h_L$. This signal might be a coherent signal that
is always positive or a quasiharmonic signal that has been passed, with its attendant
noise, through a rectifier before sampling. The receiver forms all LM possible pairs
$(g_i, h_j)$ of samples from the two inputs. It counts the number of pairs in which $g_i$
exceeds $h_j$ to form the statistic

$$V = \sum_{i=1}^{M} \sum_{j=1}^{L} U(g_i - h_j).$$

If V exceeds a decision level $V_0$, the receiver decides that a signal is present. Alternatively,
the receiver can arrange all (M + L) samples in order of their values. The
position of a sample in this ordering, starting from the smallest, is its rank. The
sum W of the ranks of the M samples $g_i$ is linearly related to the statistic V and
can be used instead. This receiver is nonparametric for all noise inputs for which the
L samples $h_j$ have the same probability density function as the M samples $g_i$ under
hypothesis $H_0$ and are statistically independent of them. Tables that can be used for
setting the decision level $V_0$ have been published by Fix and Hodges [Fix55] and by
Wilcoxon et al. [Wil73]. 
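The two equivalent forms of the statistic can be sketched in a few lines. The function names `mann_whitney_v` and `rank_sum_w` are illustrative, not taken from the text, and the sketch assumes continuous data so that ties do not occur.

```python
import random

def mann_whitney_v(g, h):
    """Count the pairs (g_i, h_j) in which g_i exceeds h_j."""
    return sum(1 for gi in g for hj in h if gi > hj)

def rank_sum_w(g, h):
    """Sum of the ranks of the g samples in the pooled, ordered data.

    Ranks start at 1 for the smallest sample; ties are assumed absent.
    """
    pooled = sorted(list(g) + list(h))
    rank = {value: position + 1 for position, value in enumerate(pooled)}
    return sum(rank[gi] for gi in g)

if __name__ == "__main__":
    random.seed(1)
    M, L = 8, 12
    h = [random.gauss(0.0, 1.0) for _ in range(L)]   # reference: noise only
    g = [random.gauss(0.7, 1.0) for _ in range(M)]   # input under test
    V = mann_whitney_v(g, h)
    W = rank_sum_w(g, h)
    # Each g sample's rank is 1 + (number of smaller samples), hence:
    assert W == V + M * (M + 1) // 2
    print(V, W)
```

Counting pairs costs on the order of $LM$ comparisons, whereas the rank form needs only one sort of the pooled samples, which is why the rank sum is often the more convenient statistic to compute.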

Under hypothesis $H_0$ the probability-generating function of the rank-sum statis-
tic $V$ is

$$E(z^V \mid H_0) = \binom{M+L}{M}^{-1} \prod_{j=1}^{M} \frac{1 - z^{L+j}}{1 - z^{j}}$$

[Ken61, p. 494]. The probability distribution of $V$ can be calculated from this
probability-generating function by saddlepoint integration as was shown for the
signed-rank statistic in (8-39) and (8-40). For $M$ and $L$ greater than about 30,
the results are very accurate. For smaller values of $M$ and $L$ the algorithm given by
Harding [Har84] is efficient, but the computation time and the storage it requires
rise rapidly as $M$ and $L$ increase.
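For small $M$ and $L$ the null distribution can be obtained exactly by expanding the generating function as a polynomial. The sketch below assumes the standard form of that function, $\prod_{j=1}^{M}(1 - z^{L+j})/(1 - z^{j})$ divided by $\binom{M+L}{M}$, and works with integer coefficients truncated at degree $ML$. It is written in the spirit of the generating-function approach and is not claimed to reproduce Harding's published algorithm step for step; the names `null_distribution_v` and `false_alarm_probability` are illustrative.

```python
from math import comb

def null_distribution_v(M, L):
    """Return [Pr(V = v | H0) for v = 0..M*L] for the pair-count statistic V."""
    n = M * L
    c = [0] * (n + 1)
    c[0] = 1
    for j in range(1, M + 1):
        k = L + j
        # Multiply by (1 - z**k); go downward so old coefficients are used.
        for i in range(n, k - 1, -1):
            c[i] -= c[i - k]
        # Divide by (1 - z**j), i.e. multiply by 1 + z**j + z**(2j) + ...
        for i in range(j, n + 1):
            c[i] += c[i - j]
    total = comb(M + L, M)          # number of equally likely orderings
    return [count / total for count in c]

def false_alarm_probability(M, L, v0):
    """Pr(V >= v0 | H0)."""
    return sum(null_distribution_v(M, L)[v0:])

if __name__ == "__main__":
    print(null_distribution_v(2, 2))
    print(false_alarm_probability(8, 12, 80))
```

The coefficient array has $ML + 1$ entries and each of the $M$ factors costs a pass over it, which illustrates why time and storage grow quickly with $M$ and $L$.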

When both numbers $L$ and $M$ of samples are very large, the statistic $V$ is ap-
proximately Gaussian. Capon [Cap59] showed that if the ratio $L/M$ remains finite as
$L$ and $M$ increase beyond all bounds and the signal strength vanishes, the asymptotic
relative efficiency of this receiver relative to the Neyman-Pearson receiver is given
by the expression in (8-41). When the noise is Gaussian, the asymptotic relative
efficiency is $3/\pi = 0.955$; and whatever the noise distribution, the asymptotic rela-
tive efficiency cannot fall below $108/125 = 0.864$. The receiver is thus less affected
than the Neyman-Pearson receiver by deviations of the noise distribution from the
Gaussian.

8.2.7 Two-input Systems 

The nonparametric receivers considered so far have required a certain coherence in 
the signals, in the sense that the signals must consistently drive the sample values to- 
ward more positive or more negative amplitudes. This coherence, however, may not 
always exist. If the signal is a random process taking on both positive and negative 
values during the observation or if the signals have random phases, the samples may 
have various signs, and it will be impossible to distinguish inputs containing signals 
from pure noise simply by looking at the signs of the samples. 

The detection of stochastic signals, which, like random noise, can be described
only by means of a collection of probability density functions, will be treated in
Chapter 11 by the parametric methods developed heretofore. At present we shall
only mention that such signals arise in multipath communications, sonar, and radio
astronomy. With these signals the inputs to the receiver may have expected value
zero under both hypotheses, the principal differences being a greater power level
and, possibly, a different spectral density under hypothesis $H_1$ from what is observed
with a signal absent. If the statistics of the noise are unknown, it is difficult to take
advantage of distinctions such as these.

If a reference noise input is available and known to be free of any signals, one 
might set up a receiver based on the Mann-Whitney-Wilcoxon test just described. 
The outputs of the prefilter would be applied to a quadratic rectifier before sampling, 
and one could expect the samples of the input being tested to be mostly larger than 
those of the reference input when a signal is present. 

If the noise arises mainly in the receiver or nearby, it may be simpler to try
picking up the signal with two receivers so placed that the signal components of the
inputs to each are the same, while the noise components are independently random.
The presence of a signal is then indicated by a correlation between the two inputs
that is absent when there is no signal. Let the samples of the prefiltered inputs
of the two receivers be $v_1, v_2, \ldots, v_M$ and $w_1, w_2, \ldots, w_M$. The samples in each
set are supposed to be statistically independent among themselves and statistically
homogeneous.

The sample correlation coefficient

$$r = \frac{\sum_{i=1}^{M} (w_i - \bar{w})(v_i - \bar{v})}{\left[\sum_{j=1}^{M} (w_j - \bar{w})^2 \sum_{j=1}^{M} (v_j - \bar{v})^2\right]^{1/2}}, \qquad \bar{v} = \frac{1}{M} \sum_{j=1}^{M} v_j, \quad \bar{w} = \frac{1}{M} \sum_{j=1}^{M} w_j,$$

will tend to be small when there is no signal present, and a receiver might compare
$r$ with a decision level $r_0$, choosing hypothesis $H_1$ when $r$ is the larger. If there
is a possibility of a constant, unknown phase shift between the signals at the two
receivers, the absolute value $|r|$ should be compared with a level $r_0'$, hypothesis $H_0$
being chosen when $|r| < r_0'$. These receivers are only asymptotically nonparametric;
for a finite number of samples of each input the decision levels $r_0$ and $r_0'$ for a
preassigned false-alarm probability will depend somewhat on the true distribution
of the noise.

In order to eliminate this dependence on the distributions, the receiver should
work with the signs or the ranks of the samples. The simplest system is the polarity
coincidence correlator, which has been analyzed by Wolff, Thomas, and Williams
[Wol62], Ekre [Ekr63], and Kanefsky and Thomas [Kan65]. It bases its decisions on
the signs of the products of the samples,

$$V = \sum_{i=1}^{M} U(v_i w_i),$$

where $U(\cdot)$ is again the unit step function. If this statistic exceeds a decision level,
the signs of the two sets of samples have a positive correlation, and the presence of
a signal is indicated. This receiver is nonparametric for inputs that under hypothesis
$H_0$ are independent and have even probability density functions $p_0(v) = p_0(-v)$ and
$p_0(w) = p_0(-w)$.
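A minimal sketch of the polarity coincidence correlator follows. It relies on one consequence of the stated conditions: with independent inputs having even, continuous densities, each product $v_i w_i$ is positive with probability 1/2 under $H_0$, so $V$ is binomially distributed with parameters $(M, \tfrac{1}{2})$ whatever the noise density. The function names are illustrative.

```python
from math import comb

def pcc_statistic(v, w):
    """V = sum of U(v_i * w_i): the number of sample pairs with like signs.

    A product of exactly zero is counted as 'unlike'; for continuous data
    this has probability zero.
    """
    return sum(1 for vi, wi in zip(v, w) if vi * wi > 0)

def pcc_false_alarm(M, level):
    """Pr(V >= level | H0) for V binomial with parameters (M, 1/2)."""
    return sum(comb(M, k) for k in range(level, M + 1)) / 2 ** M

def pcc_decision_level(M, q0):
    """Smallest integer level whose false-alarm probability is at most q0."""
    for level in range(M + 2):
        if pcc_false_alarm(M, level) <= q0:
            return level

if __name__ == "__main__":
    M = 100
    level = pcc_decision_level(M, 0.01)
    print(level, pcc_false_alarm(M, level))
```

Because $V$ is discrete, a preassigned false-alarm probability can in general only be bounded, not met exactly, unless the decision at the boundary is randomized.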

If the signals and the noise are independent Gaussian random processes, the
optimum receiver based on the Neyman-Pearson criterion simply combines the sam-
ples and adds the squares of their pairwise sums, comparing

$$W = \sum_{i=1}^{M} (v_i + w_i)^2 \tag{8-42}$$

with a decision level. With respect to this detector the asymptotic relative efficiency
of the polarity coincidence correlator is

$$\text{a.r.e.} = 2(q + \sigma^4)[p_0(0)]^4, \tag{8-43}$$

where $\sigma^2$ is the variance and $q$ the fourth central moment of the noise, and $p_0(0)$ is
the value of the noise probability density function at the origin. When the
noise is Gaussian, $q = 3\sigma^4$ and $p_0(0) = (2\pi\sigma^2)^{-1/2}$, and this asymptotic relative
efficiency equals $2/\pi^2 = 0.202$, hardly a
promising result. Nevertheless, for noise with certain types of cuspidated distribu-
tion, Kanefsky and Thomas [Kan65] have found this asymptotic relative efficiency
to exceed 1. They were careful to point out, however, that unless the number $M$
of samples is huge and the signal-to-noise ratio infinitesimal, the use of the central
limit theorem to evaluate the polarity coincidence correlator with the type of noise
probability density functions they considered may seriously overestimate its relative
efficiency.
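The dependence of the asymptotic relative efficiency on the shape of the noise density can be checked numerically. The sketch assumes that (8-43) has the form $2(q + \sigma^4)[p_0(0)]^4$, which reproduces the Gaussian value $2/\pi^2$ quoted above. The bilateral exponential density $p_0(x) = (2\beta)^{-1} e^{-|x|/\beta}$, with $\sigma^2 = 2\beta^2$ and $q = 24\beta^4$, serves as an example of a cusped density; the parameter name `beta` and the function name `are_pcc` are illustrative.

```python
from math import pi, sqrt

def are_pcc(p0_at_zero, variance, fourth_moment):
    """Asymptotic relative efficiency 2 (q + sigma^4) [p0(0)]^4."""
    return 2.0 * (fourth_moment + variance ** 2) * p0_at_zero ** 4

# Gaussian noise: p0(0) = 1/(sigma sqrt(2 pi)), q = 3 sigma^4.
sigma = 1.0
are_gauss = are_pcc(1.0 / (sigma * sqrt(2.0 * pi)), sigma ** 2, 3.0 * sigma ** 4)

# Bilateral exponential noise: p0(0) = 1/(2 beta), variance 2 beta^2, q = 24 beta^4.
beta = 1.0
are_laplace = are_pcc(1.0 / (2.0 * beta), 2.0 * beta ** 2, 24.0 * beta ** 4)

if __name__ == "__main__":
    print(are_gauss, are_laplace)
```

Under these assumptions the value for the cusped density exceeds 1, in line with the remark about cuspidated distributions, and the same caution about finite $M$ applies to it.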

Problems 

8-1. A signal $Af(t)$ of known shape, but unknown positive amplitude $A$, is received in
white Gaussian noise of unknown spectral density $N$. A set of functions $f_1(t), f_2(t), \ldots$,
orthonormal among themselves and orthogonal to $f(t)$ over the observation interval
$(0, T)$, is determined, as by the Gram-Schmidt orthogonalization procedure described
in Sec. 2.1.3. The input $v(t)$ is passed through $n$ parallel filters matched to $n$ of these
functions $f_j(t)$ to obtain the statistics

$$v_j = \int_0^T f_j(t) v(t)\, dt, \qquad j = 1, 2, \ldots, n.$$

Show that an unbiased estimator of the spectral density $N$ is

$$\hat{N} = \frac{2}{n} \sum_{j=1}^{n} v_j^2,$$

and that by taking $n$ large enough the variance of this estimator can be made as small
as desired. Propose a receiver, based on the estimate $\hat{N}$ and on the output of a filter
matched to $f(t)$, for detecting the signal $s(t) = Af(t)$, and show that if $n$ is made large enough,
this receiver is as reliable as one designed for detection in white noise of known spectral
density $N$.

8-2. A signal of known form $s(t) = Af(t)$, but unknown positive amplitude $A$, is to be
detected in Gaussian noise of autocovariance function $\phi(t, u) = \sigma^2 \eta(t, u)$, of which
$\eta(t, u)$ is known, but the variance $\sigma^2$ is not; $\eta(0, 0) = 1$. As in Sec. 2.1.6, make
Karhunen-Loève expansions of the signal, the noise, and the input $v(t)$. Arrange the
eigenvalues $\lambda_k$ of the kernel $\eta(t, u)$ in descending order, and denote the associated
eigenfunctions by $f_k(t)$. As data the $n$ quantities

$$v_k = \int_0^T f_k(t) v(t)\, dt, \qquad k = 1, 2, \ldots, n,$$

are to be used. Design a maximum-likelihood receiver to detect the signal on the basis
of estimates of the amplitude $A$ and the variance $\sigma^2$. Show that if $n$ is taken large
enough, the reliability of this receiver will be as great as that of a receiver designed for
detection in noise of known variance $\sigma^2$ [Sch71].

8-3. Work out the following alternative version of the solution of Problem 8-2. Write the
input $v(t)$ as

$$v(t) = v_1(t) + v_2(t), \qquad 0 \le t \le T,$$

with $v_1(t)$ defined as




where $g(t)$ is the solution of the integral equation

$$f(t) = \int_0^T \eta(t, s) g(s)\, ds, \qquad 0 \le t \le T.$$

Show that $v_2(t)$ is statistically independent of $v_1(t)$ and that its probability density
functions are the same whether the signal is present or not. Derive its autocovariance
function. Use Problem 6-7 to determine the unknown variance $\sigma^2$ from $v_2(t)$, and use
this variance in setting the decision level for the detection statistic. What should this
detection statistic be?

8-4. Derive the zero-order saddlepoint approximations for the false-alarm and detection 
probabilities of the CFAR receiver from the moment-generating function h{z) in (8-8). 
Use the technique described in Sec. 5.3.3 to determine approximately the constant (i 
that yields a preassigned false-alarm probability Q = Pr(£/ > H Q ). Then use your 
saddlepoint approximation to check the detection probabilities $Q_d$ plotted in Fig. 8-1
in the range $Q_d > 0.9$.

8-5. Suppose that quasiharmonic signals

$$s_k(t) = A\, \mathrm{Re}\, F(t) \exp(i\Omega t + i\psi_k), \qquad k = 1, 2, \ldots, n,$$

with independently random phases $\psi_k$, but with a common, though unknown amplitude
$A$, are all either present or absent in a set of $n$ successive inputs $v_k(t)$ to the receiver.
The data on which their detection is to be based are the real and imaginary parts $x_k$
and $y_k$ of the output of a filter matched to the signal,

$$x_k + i y_k = \int_0^T F^*(t) V_k(t)\, dt, \qquad k = 1, 2, \ldots, n,$$

where $V_k(t)$ is the complex envelope of the $k$th input. The noise is white and Gaussian
of unknown spectral density. Work out under each hypothesis the maximum-likelihood
estimates of the common variance of the $n$ $x_k$'s and the $n$ $y_k$'s, determine the maximum-
likelihood estimates of the phases $\psi_k$ and of the common amplitude $A$, and show that
the maximum-likelihood detector is equivalent to one that compares the statistic

$$\frac{\sum_{k=1}^{n} (x_k^2 + y_k^2)^{1/2}}{\left[\sum_{k=1}^{n} (x_k^2 + y_k^2)\right]^{1/2}}$$



with a suitable decision level. 
8-6. Show how the false-alarm and detection probabilities for the sign test, as given in (8-22) 
and (8-23), can be calculated by the saddlepoint approximation introduced in Sec. 5.3.2. 
Assuming that the number n of stages is a continuous variable, as described after (8-23), 
show how to determine the value of n required for the sign test to attain a preassigned 
false-alarm probability. By using these saddlepoint approximations, check the efficien- 
cies plotted in Fig. 8-4 at a number of values of M > 125. Show how to determine the 
false-alarm and detection probabilities of the Neyman-Pearson receiver by the saddle- 
point approximation when the noise has the bilateral exponential distribution in (8-27), 
and use your results to check the efficiencies plotted in Fig. 8-5 for several values of 
M > 125. 

8-7. The aim of this problem is to work out the asymptotic relative efficiency of the polar-
ity coincidence correlator relative to the detector that is optimum when the data have
Gaussian distributions. The inputs to two receivers are corrupted by statistically inde-
pendent noise processes whose distributions, though unknown, are the same in both
receivers. The signals, on the other hand, are identical in the two receivers, but they
are random and differ from one observation interval to the next. Denoting the inputs
to the two receivers by $x_i(t)$ and $y_i(t)$, we can express them as

$$x_i(t) = s_i(t) + n'_i(t),$$

$$y_i(t) = s_i(t) + n''_i(t), \qquad 1 \le i \le M,$$

under hypothesis $H_1$. Under hypothesis $H_0$ the signals $s_i(t)$ are absent. Both re-
ceivers process these $M$ inputs in identically linear fashion to remove noise outside
the frequency band of the signals, producing a total of $2M$ statistically homogeneous
samples:

$$v_i = f[x_i(t)], \qquad w_i = f[y_i(t)], \qquad i = 1, 2, \ldots, M.$$

Under hypothesis $H_0$, "noise alone present," the $v_i$'s and $w_i$'s are statistically
independent with a common probability density function $p_0(\cdot)$, which is an even func-
tion: $p_0(v) = p_0(-v)$ and $p_0(w) = p_0(-w)$. When the signals are present (hypothesis
$H_1$), they add to the noise, so that the conditional probability density functions of the
$i$th samples are $p_0(v_i - s_i)$ and $p_0(w_i - s_i)$, respectively; $s_i$ is the contribution of the
signal to each sample. The values $s_i$ of the signal samples are independently random
with expected value zero and variance $s^2$.

(a) Show that if the noise is Gaussian, the optimum receiver forms the statistic $W$
in (8-42) and compares it with a decision level $W_0$, choosing hypothesis $H_1$ when
$W > W_0$. Calculate the effective signal-to-noise ratio

$$D_W^2 = \frac{[E(W \mid H_1) - E(W \mid H_0)]^2}{\mathrm{Var}\, W}$$

of this receiver in terms of the signal variance $s^2$, the noise variance $\sigma^2$, and the
fourth moment $q = E(n_i'^4) = E(n_i''^4)$ of the noise. Remember that $n'_i$ and $n''_i$ are
statistically independent for all $i$.

(b) When the noise density function $p_0(\cdot)$ is unknown, but an even function, the
receiver will instead base its decision on the output

$$V = \sum_{i=1}^{M} U(v_i w_i)$$

of the polarity coincidence correlator; $U(\cdot)$ is the unit step function. Calculate the
effective signal-to-noise ratio $D_V^2$ of this statistic $V$ in the limit in which $M \gg 1$
and $s^2 \ll 1$. In doing so, assume first that each $s_i$ is a fixed, known quantity, and
write an expression for

$$E[U(v_i w_i) \mid H_1] - E[U(v_i w_i) \mid H_0]$$

in terms of $p_0(0)$ and $s_i$. Then expand this in a power series in $s_i$, retaining only
the term of lowest order, which will turn out to be proportional to $s_i^2$. Replace $s_i^2$
by its expected value $s^2$ before continuing your calculation of $D_V^2$.

From the results of parts (a) and (b), determine the asymptotic relative effi- 
ciency of the polarity coincidence correlator with respect to the Neyman-Pearson 
receiver. Evaluate it for signal and noise samples that have Gaussian distributions 
and bilateral exponential distributions (8-27). 
9.1 THE SEQUENTIAL PROBABILITY RATIO TEST

Let us consider a radar that is assigned to deciding about the presence or absence 
of a target at a specific distance and in a specific direction on the basis of M inputs 
$v_k(t)$, $1 \le k \le M$. These inputs are acquired one after another during successive
observation intervals, each initiated by the transmission of an energetic narrowband 
pulse. If a target is present, each input contains an echo signal having known form, 
but perhaps randomly varying parameters such as amplitude and phase. The M 
inputs $v_k(t)$ possess the statistical homogeneity described in Sec. 4.4.1. A continuum
of distances is in practice examined simultaneously, but this aspect we shall disregard 
for the time being. When the M pulses have been transmitted and the decision has 
been made, the beam of the antenna is shifted to a new azimuth and the procedure is 
repeated. When the receiver processes each set of M inputs as explained in Sec. 4.4, 
it is said to be executing a fixed-sample-size statistical test. The amount of time 
needed to search for targets in a certain portion of the sky is proportional to M. 

If the antenna is composed of an array of distinct transducers, as described in 
Sec. 4.3, its beam can be moved electronically by suitably altering the phase shifts in 
the lines from the transducers to the receiver. In this way the direction of the beam 
can be placed under the control of the radar observer or his electronic counterpart, 
and it is unnecessary always to transmit the same number of pulses in each direction. 
Instead the beam can be shifted to a new direction as soon as the receiver judges 
that it has information of sufficient quality and quantity to make a reliable decision 
about the presence or absence of targets at the current azimuth. The number of pulses 
transmitted in a given direction and of inputs processed before making a decision 
then depends on the inputs themselves and is a random variable. A detection system
that determines the number of its observations according to what it receives is said 
to operate sequentially. Sequential operation permits searching the sky more rapidly 
without diminishing the overall reliability of detection. 

Sequential detection will be introduced in the context of Sec. 4.4.1, as though
a radar were searching for a target at a fixed distance. The inputs to the receiver are
denoted as before by $v_j(t)$, $j = 1, 2, \ldots$, and the number of inputs processed before
making a decision is no longer fixed, but random. Again the $j$th input is

$$v_j(t) = n_j(t), \qquad j = 1, 2, \ldots, \quad 0 \le t \le T,$$

under hypothesis $H_0$ and

$$v_j(t) = n_j(t) + s(t; a_j, \theta'_j, \theta'')$$

under hypothesis $H_1$. The same assumptions about the noise $n_j(t)$ and about the
signals $s(t; a_j, \theta'_j, \theta'')$ are being made as in Sec. 4.4. The parameters $\theta'_j$ are in-
dependently random from one input to another; the parameters $\theta''$ are invariable.
Statistical homogeneity of the inputs is postulated. The receiver processes each input
in such a way as to produce a single datum

$$g_j = f[v_j(t)], \qquad j = 1, 2, \ldots.$$

Its probability density function under hypothesis $H_1$ is assumed to have been aver-
aged over the values of the random parameters $\theta'_j$ of the signal. By virtue of our
assumptions, the data $g_j$ are statistically independent.

When, say, $k$ inputs $v_j(t)$, $1 \le j \le k$, have been received and processed to
provide $k$ data, or samples, $g_1, g_2, \ldots, g_k$, the receiver is said to have reached the
$k$th stage of its operation. At each stage the receiver makes one of three decisions:
(1) Hypothesis $H_0$ is true; no signal is present, (2) Hypothesis $H_1$ is true, and an
echo signal of the specified class is present, or (3) Another pulse is to be transmitted,
another input $v_{k+1}(t)$ acquired, and a new datum $g_{k+1}$ produced. If one of the first
two decisions is made, the sequential test terminates; otherwise it continues through
at least one more stage. How shall the decisions be made?

A fully Bayesian approach might be attempted, taking into account the costs
entailed by the decisions and the cost of acquiring each new datum. Prior prob-
ability density functions of the parameters would be adopted. The data sets $\mathbf{g} =
(g_1, g_2, \ldots, g_k, \ldots)$ leading to each possible terminal decision determine a decom-
position of the infinite-dimensional space of those sets $\mathbf{g}$ into regions $R_0$ and $R_1$, and
the boundaries between them must be laid in such a way as to minimize the average
cost of operation. Such a minimization would manifestly be an exceedingly difficult
mathematical problem.

When the cost of acquiring each new statistically independent input $v_{k+1}(t)$
and generating each new datum $g_{k+1}$ is invariable, and when the costs attending the
various decisions are independent of the true values of the signal parameters, the
choice among the three decisions (1), (2), and (3) can be based on the posterior
probability

$$\Pr(H_1 \mid g_1, g_2, \ldots, g_k) = \Pr(H_1 \mid \mathbf{g}^{(k)})$$

of hypothesis $H_1$, given the data $\mathbf{g}^{(k)} = (g_1, g_2, \ldots, g_k)$ at hand;
$\Pr(H_0 \mid \mathbf{g}^{(k)}) = 1 - \Pr(H_1 \mid \mathbf{g}^{(k)})$. If that posterior probability is large enough, hypothesis $H_1$ will be ac-
cepted; if small enough, $H_0$; otherwise the system will proceed to the next stage
$k + 1$. Symbolically, there will be two constants $\alpha$ and $\beta$, $\alpha > \beta$, such that

$$\begin{aligned}
\Pr(H_1 \mid \mathbf{g}^{(k)}) \ge \alpha &\;\Rightarrow\; \text{choose } H_1, \\
\beta < \Pr(H_1 \mid \mathbf{g}^{(k)}) < \alpha &\;\Rightarrow\; \text{take another observation}, \\
\Pr(H_1 \mid \mathbf{g}^{(k)}) \le \beta &\;\Rightarrow\; \text{choose } H_0.
\end{aligned} \tag{9-1}$$

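The posterior probability needed is

$$\Pr(H_1 \mid \mathbf{g}^{(k)}) = \frac{\zeta_1 P_1(\mathbf{g}^{(k)})}{\zeta_0 P_0(\mathbf{g}^{(k)}) + \zeta_1 P_1(\mathbf{g}^{(k)})} = \frac{\zeta_1 \Lambda(\mathbf{g}^{(k)})}{\zeta_0 + \zeta_1 \Lambda(\mathbf{g}^{(k)})}, \tag{9-2}$$

where $P_i(\mathbf{g}^{(k)}) = P_i(g_1, g_2, \ldots, g_k)$ is the joint probability density function of the
data $\mathbf{g}^{(k)} = (g_1, g_2, \ldots, g_k)$ under hypothesis $H_i$, $i = 0, 1$; $\zeta_0$ and $\zeta_1$ are the prior
probabilities of hypotheses $H_0$ and $H_1$, respectively, and

$$\Lambda(\mathbf{g}^{(k)}) = \frac{P_1(\mathbf{g}^{(k)})}{P_0(\mathbf{g}^{(k)})}$$

is the likelihood ratio at stage $k$. Because the data are statistically independent,

$$\Lambda(\mathbf{g}^{(k)}) = \prod_{j=1}^{k} \Lambda(g_j)$$

with

$$\Lambda(g) = \frac{P_1(g; S, \theta'')}{P_0(g)}$$

the likelihood ratio for any datum $g$. It depends on the signal strength $S$ and the
set of invariable parameters $\theta''$, but not on the set of random parameters $\theta'$ as these
were defined in Sec. 4.4. We suppose that the receiver is set up to detect a standard
signal having strength $S = S_s$ and the set $\theta'' = \theta''_s$ of invariable parameters.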

Because the data are statistically independent and the cost of making an ob-
servation and acquiring a new datum $g$ is constant, the choice the receiver makes
depends only on the posterior probability $\Pr(H_1 \mid \mathbf{g})$ of hypothesis $H_1$, $\mathbf{g}$ representing
the data so far collected. It does not matter how many data have been collected, for
they are all subsumed in that posterior probability. Given $\Pr(H_1 \mid \mathbf{g})$, the prospect
of the future behavior of the sequential procedure, its eventual outcome, and its
eventual total cost appears the same to the receiver whether few or many data $\mathbf{g}$ are
at hand. The decision levels $\beta$ and $\alpha$ in (9-1) are therefore independent of the stage
$k$ that the test has reached.

Introducing the notation

$$u_j = \ln \Lambda(g_j) = \ln[P_1(g_j; S_s, \theta''_s)/P_0(g_j)], \tag{9-3}$$

$$U_k = \ln \Lambda(\mathbf{g}^{(k)}) = \sum_{j=1}^{k} u_j \tag{9-4}$$

for the logarithms of the likelihood ratios, we can rewrite the inequalities (9-1) in
the form

$$\begin{aligned}
U_k \ge a = \ln A &\;\Rightarrow\; \text{choose } H_1, \\
b = \ln B < U_k < a = \ln A &\;\Rightarrow\; \text{take another observation}, \\
U_k \le b = \ln B &\;\Rightarrow\; \text{choose } H_0,
\end{aligned} \tag{9-5}$$

where

$$A = e^a = \frac{\zeta_0 \alpha}{\zeta_1 (1 - \alpha)}, \qquad B = e^b = \frac{\zeta_0 \beta}{\zeta_1 (1 - \beta)}$$

are the decision levels on the likelihood ratio $\Lambda(\mathbf{g})$. (The notation $A$ for the upper
decision level should not be confused with the $A$ we have often used to indicate the
amplitude of a signal.) A test basing decisions of this kind on the current value of
the likelihood ratio $\Lambda(\mathbf{g}^{(k)})$ or its logarithm $U_k$ is called a sequential probability ratio
test. Its properties were extensively investigated by Wald [Wal47].
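The three-way decision rule on the accumulated log-likelihood ratio can be sketched as a short loop. The function `sprt` and its arguments are illustrative names. The Gaussian example assumes unit-variance data with mean 0 under $H_0$ and mean $d_s$ under $H_1$, for which $u = d_s g - \tfrac{1}{2} d_s^2$; the levels used in the demonstration are arbitrary.

```python
import random

def sprt(data, log_lr, a, b):
    """Sequential probability ratio test on the log-likelihood ratio.

    data   : iterable yielding the data g_1, g_2, ...
    log_lr : function returning u = ln Lambda(g) for a single datum
    a, b   : upper and lower decision levels, b < 0 < a
    Returns (decision, stages): decision is 1 for H1, 0 for H0, or None
    if the data are exhausted before either level is reached.
    """
    U = 0.0
    k = 0
    for g in data:
        k += 1
        U += log_lr(g)      # accumulate the increment for the new datum
        if U >= a:
            return 1, k
        if U <= b:
            return 0, k
    return None, k

def gaussian_log_lr(d_s):
    """Log-likelihood ratio for unit-variance Gaussian data, mean 0 versus d_s."""
    return lambda g: d_s * g - 0.5 * d_s ** 2

if __name__ == "__main__":
    random.seed(0)
    d_s = 0.5
    noise_only = (random.gauss(0.0, 1.0) for _ in range(10 ** 6))
    print(sprt(noise_only, gaussian_log_lr(d_s), a=4.6, b=-4.6))
```

The number of stages returned is itself random, which is the defining feature of sequential operation.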

As the test proceeds through its successive stages, the value of the logarithm
$U$ of the likelihood ratio changes with the acquisition of each new datum $g_k$:

$$U_k = U_{k-1} + u_k, \qquad u_k = \ln[P_1(g_k; S_s, \theta''_s)/P_0(g_k)].$$

The point $U$ is said to execute a discrete-time random walk on the real line. Because
the data $g_j$ and hence the increments $\ln \Lambda(g_j) = u_j$ of $U$ are independent random
variables, the process is said to be an independent-increment process [Hel91, p. 321],
[Lar79, vol. I, p. 155]. The false-alarm probability $Q_0(a, b)$ equals the probability
under hypothesis $H_0$ that the random variable $U_k$ crosses the upper "barrier" $a =
\ln A$ before passing below the lower barrier $b = \ln B$:

$$Q_0(a, b) = \Pr(\text{for some } k\colon\; U_k \ge a,\; b < U_j < a,\; j = 1, \ldots, k - 1 \mid H_0),$$

and the detection probability $Q_d(a, b)$ is likewise

$$Q_d(a, b) = \Pr(\text{for some } k\colon\; U_k \ge a,\; b < U_j < a,\; j = 1, \ldots, k - 1 \mid H_1).$$

The latter depends on the strength or average strength of the signals, denoted by
$S$, and on the set $\theta''$ of invariable parameters. Wald has shown that this kind of
test eventually terminates with probability 1. Calculating the probabilities $Q_0(a, b)$
and $Q_d(a, b)$ in their dependence on the decision levels $a$ and $b$ requires solving an
integral equation whose kernel is the probability density function of $u = \ln \Lambda(g)$
under hypothesis $H_0$ or $H_1$, as the case may be [Sam48], [Kem50]. To determine
the levels $a$ and $b$ that minimize the average cost of operation is a difficult problem
[Bla54, Ch. X].

The sequential test can also be set up in such a way as to attain a particular
reliability $(Q_0, Q_{ds})$, where $Q_{ds} = Q_d(S_s, \theta''_s)$ is the probability of detecting a signal hav-
ing the standard strength $S_s$ and the standard set $\theta''_s$ of the invariable parameters.
One picks the decision levels $a$ and $b$ so that $Q_0(a, b) = Q_0$ and $Q_d(a, b) = Q_{ds}$.
This formulation is the counterpart of applying the Neyman-Pearson criterion in a
fixed-sample-size test and dispenses with knowledge of the cost matrix $C$ and the
prior probabilities $\zeta_0$ and $\zeta_1$ of the two hypotheses. As aforesaid, computing the
functions $Q_0(a, b)$ and $Q_d(a, b)$ and searching for the values of $a$ and $b$ can be ex-
pected to be tedious. Wald cut through all these complexities by discovering simple
approximations to those functions. They are based on the inequalities

$$A = e^a \le \frac{Q_{ds}}{Q_0}, \qquad B = e^b \ge \frac{1 - Q_{ds}}{1 - Q_0} \tag{9-6}$$

[Wal47, p. 41], which we shall now derive.

At the $n$th stage the space $R_n$ of the data

$$\mathbf{g}^{(n)} = (g_1, g_2, \ldots, g_n)$$

is divided by the decision rule (9-5) into three regions $R_0^{(n)}$, $R_1^{(n)}$, and $R_2^{(n)}$: $R_0^{(n)}$ is the
region of points $\mathbf{g}^{(n)}$ leading to termination of the test with a decision for hypothesis
$H_0$; $R_1^{(n)}$ is the region of points leading to termination with decision for $H_1$; and $R_2^{(n)}$
is the region of points leading to the decision to take an $(n + 1)$th observation. The
probability of choosing hypothesis $H_1$ at that stage is

$$Q_0^{(n)} = \int_{R_1^{(n)}} P_0(\mathbf{g})\, d^n g, \qquad d^n g = dg_1\, dg_2 \cdots dg_n,$$

when hypothesis $H_0$ is true. When hypothesis $H_1$ is true, this probability is

$$Q_{ds}^{(n)} = \int_{R_1^{(n)}} P_1(\mathbf{g}; S_s, \theta''_s)\, d^n g = \int_{R_1^{(n)}} \Lambda(\mathbf{g}^{(n)}; S_s, \theta''_s) P_0(\mathbf{g})\, d^n g \ge A \int_{R_1^{(n)}} P_0(\mathbf{g})\, d^n g = A Q_0^{(n)} \tag{9-7}$$

because $\Lambda(\mathbf{g}^{(n)}; S_s, \theta''_s) \ge A$ when hypothesis $H_1$ is chosen. Summing over all stages
$n$, we find

$$Q_{ds} \ge A Q_0, \tag{9-8}$$

whence the first inequality in (9-6). The second follows by considering in a similar
manner the probabilities of choosing hypothesis $H_0$ under the two hypotheses, and
one finds

$$1 - Q_{ds} \le B(1 - Q_0) \tag{9-9}$$

because $\Lambda(\mathbf{g}^{(n)}; S_s, \theta''_s) \le B$ for $\mathbf{g}^{(n)} \in R_0^{(n)}$.

Wald observed that when the number of stages is on the average very large,
which in our context implies that the standard signal strength $S_s$ is very small, the
increments $u_j$ in the logarithm $U$ of the likelihood ratio are on the average also
very small. When the test terminates with selection of hypothesis $H_1$, therefore, the
value of the likelihood ratio $\Lambda(\mathbf{g}^{(n)}; S_s, \theta''_s)$ in (9-2) will be only slightly greater than
$A$, whereupon (9-7) and hence also (9-8) are nearly equalities. The same argument
applies to (9-9), for when $H_0$ is chosen, $\Lambda(\mathbf{g}^{(n)}; S_s, \theta''_s)$ is very close to $B$. Thus Wald
established the approximate formulas for the decision levels $a$ and $b$ required for
attaining the reliability $(Q_0, Q_{ds})$:

$$a = \ln A \approx \ln\left[\frac{Q_{ds}}{Q_0}\right], \qquad b = \ln B \approx \ln\left[\frac{1 - Q_{ds}}{1 - Q_0}\right] \tag{9-10}$$

[Wal47, pp. 44-8].
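Wald's approximate decision levels are simple enough to compute directly. The sketch below takes the levels to be $a \approx \ln(Q_{ds}/Q_0)$ and $b \approx \ln[(1 - Q_{ds})/(1 - Q_0)]$, the equality forms of the two inequalities derived above; the function name `wald_levels` is illustrative.

```python
from math import log

def wald_levels(q0, qds):
    """Approximate levels (a, b) on the log-likelihood ratio U_k.

    q0  : desired false-alarm probability
    qds : desired probability of detecting the standard signal
    """
    if not (0.0 < q0 < qds < 1.0):
        raise ValueError("need 0 < q0 < qds < 1")
    a = log(qds / q0)                      # upper level, positive
    b = log((1.0 - qds) / (1.0 - q0))      # lower level, negative
    return a, b

if __name__ == "__main__":
    a, b = wald_levels(1e-4, 0.99)
    print(a, b)
```

Because the levels depend only on the pair $(Q_0, Q_{ds})$, they can be set without knowing the distribution of the data; that distribution enters only through the increments $u_j$.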

It is only when the signals are weak and the average number of stages before
decision is very large that it is profitable to adopt a sequential test at all. For strong
signals the number of stages required will on the average be not much smaller than
the number required by a fixed-sample-size test with equal reliability $(Q_0, Q_{ds})$, and
the additional complexity of the sequential test is hardly worth implementing. Wald's
approximations are valid under just those conditions when a sequential test is likely
to be advantageous. These are the same conditions under which we found in Sec. 4.4
that the asymptotic relative efficiency yields a useful indication of the comparative
effectiveness of two receivers or two decision strategies. We now turn to assessing the
performance of the sequential probability ratio test under those same circumstances
of weak signals and a large average number of observations required to attain a
given reliability. The necessary formulas, which were derived by Wald [Wal47], were
applied to sequential signal detection by Bussgang and Middleton [Bus55], Blasbalg
[Bla57a], [Bla57b], and others. A comprehensive treatment of sequential analysis is
to be found in the book by Ghosh [Gho70].

9.2 PERFORMANCE OF THE SEQUENTIAL TEST 

When the sequential probability ratio test has been established in the manner just
described, it is of interest to assess its performance for signals having possibly dif-
ferent strengths $S$ and possibly different sets $\theta''$ of the invariable parameters from
those of the standard signal for which the test has been set up. Because the aim
of the sequential test is to reduce the number of samples needed on the average to
attain a given reliability, one would like to know not only the probability $Q_d$ of
detection, but also the expected number of stages through which the test must pass
before making a decision, and these as functions of the actual signal strength $S$
and the actual values of the invariable parameters $\theta''$. To determine the detection
probability $Q_d$ and the average number of samples precisely is a difficult problem,
and again approximations are sought.

For simplicity of notation we combine the signal strength $S$ and the invariable
parameters $\theta''$ into a single vector $\theta = (S, \theta'')$. Any random parameters $\theta'$ are as
before assumed independently random from one input to another, and we employ
the probability density function $P_1(g; \theta)$ of the statistic $g$ after averaging over them.
The absence of a signal we denote by $\theta = 0$; $P_1(g; 0) = P_0(g)$.¹

¹ It is recommended that the reader work Problem 9-1 in the course of reading this section.

Wald's approximations for the detection probability and the average sample
number involve the moment-generating function of the logarithm $u$ of the likelihood
ratio, defined in (9-3):

$$h(z; \theta) = E(e^{-zu} \mid \theta) = \int [\Lambda(g; \theta_s)]^{-z} P_1(g; \theta)\, dg. \tag{9-11}$$

As we showed in Sec. 5.2.1, the moment-generating function $h(z; \theta)$ is a convex-$\cup$
function of the parameter $z$ when $z$ is real. The equation

$$h(z; \theta) = 1 \tag{9-12}$$

has in general two roots, one of which is $z = 0$. The second root, which is the one
that figures in the subsequent analysis, is a function of the invariable parameters
$\theta = (S, \theta'')$, and we designate it by $z(\theta)$. It takes on the special values

$$z(0) = -1, \qquad z(\theta_s) = 1,$$

where $\theta_s = (S_s, \theta''_s)$.

When, for instance, the data $g_j$ are Gaussian random variables, as in detecting
a signal of known form, but unknown amplitude in Gaussian noise,

$$u = d_s g - \tfrac{1}{2} d_s^2,$$

and

$$h(z; d) = \exp[-(d_s d - \tfrac{1}{2} d_s^2) z + \tfrac{1}{2} d_s^2 z^2], \tag{9-13}$$

where $\theta = (d)$; $d_s^2 = 2E_s/N$ is the signal-to-noise ratio of the standard signal, and
$d^2$ is that of the signal whose probability of detection is to be calculated. The root
of (9-12) of concern is then

$$z(d) = \frac{2d}{d_s} - 1. \tag{9-14}$$
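The Gaussian case offers a quick numerical check on the root. The sketch assumes the moment-generating function has the quadratic-exponent form $\exp[-(d_s d - \tfrac{1}{2} d_s^2) z + \tfrac{1}{2} d_s^2 z^2]$, which follows from $u = d_s g - \tfrac{1}{2} d_s^2$ with $g$ Gaussian of mean $d$ and unit variance, so that its nonzero root is $z = 2d/d_s - 1$. The function names are illustrative.

```python
from math import exp

def h_gauss(z, d, d_s):
    """E[exp(-z u)] for u = d_s*g - d_s**2/2, g Gaussian with mean d, variance 1."""
    mean_u = d_s * d - 0.5 * d_s ** 2
    var_u = d_s ** 2
    return exp(-mean_u * z + 0.5 * var_u * z ** 2)

def root_gauss(d, d_s):
    """Nonzero root of h_gauss(z, d, d_s) = 1."""
    return 2.0 * d / d_s - 1.0

if __name__ == "__main__":
    d_s = 1.2
    for d in (0.0, 0.3, 0.9, 1.2, 2.0):
        z = root_gauss(d, d_s)
        print(d, z, h_gauss(z, d, d_s))   # last column should be 1 to rounding
```

The root passes through zero at $d = \tfrac{1}{2} d_s$, where $E(u) = 0$ and the two roots of the equation coincide.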

The sequential test, as we have seen, is most appropriate when the input signal-
to-noise ratio is small and the average number of stages before termination is large.
Then one can in general neglect the cumulants of the statistic $u$ of higher order than
the second and approximate the moment-generating function by

$$h(z; \theta) \approx \exp[-z E(u \mid \theta) + \tfrac{1}{2} z^2\, \mathrm{Var}(u \mid \theta)], \tag{9-15}$$

whereupon the root $z(\theta)$ is approximately

$$z(\theta) \approx \frac{2 E(u \mid \theta)}{\mathrm{Var}(u \mid \theta)}. \tag{9-16}$$

When $E(u \mid \theta) = 0$, the equation (9-12) has a double root at $z = 0$. This usually
occurs for a signal-to-noise ratio on the order of one-half the standard ratio and
necessitates special treatment in the derivations to follow.

9.2.1 The Detection Probability

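The probability $Q_d(\theta)$ of detection, averaged over the random parameters $\theta'$, is given
approximately by Wald's formula:

$$Q_d(\theta) \approx \frac{1 - B^{-z}}{A^{-z} - B^{-z}} = \frac{1 - e^{-bz}}{e^{-az} - e^{-bz}}, \tag{9-17}$$

where $z = z(\theta)$ is the nonzero root of (9-12). For $\theta = 0$ and $\theta = \theta_s$, this reduces to

$$Q_0 \approx \frac{1 - B}{A - B}, \qquad Q_d(\theta_s) \approx \frac{A(1 - B)}{A - B}, \qquad A = e^a, \quad B = e^b, \tag{9-18}$$

which are equivalent to (9-10) and hence to (9-6) treated as an approximation.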

Wald derived (9-17) by the following reasoning [Wal47, pp. 48-52]. When the
parameter $z$ is a root of (9-12), the function

$$q(g; \theta) = [\Lambda(g; \theta_s)]^{-z} P_1(g; \theta)$$

can be treated as a probability density function, for it is both nonnegative and
correctly normalized. Let $H$ be the hypothesis that the datum $g$ has the probability
density function $P_1(g; \theta)$, and let $H^*$ be the alternative hypothesis that its density
function is $q(g; \theta)$. We consider a sequential test $\Sigma^*$ that tests hypothesis $H^*$ against
the null hypothesis $H$. At its $k$th stage it forms the accumulated likelihood ratio

$$\Lambda_k^* = \prod_{j=1}^{k} \frac{q(g_j; \theta)}{P_1(g_j; \theta)} = \prod_{j=1}^{k} \left[\frac{P_1(g_j; \theta_s)}{P_0(g_j)}\right]^{-z}$$

and compares it with two decision levels $A^*$ and $B^*$, deciding for $H$ if $\Lambda_k^* \le B^*$, for
$H^*$ if $\Lambda_k^* \ge A^*$, and taking another observation if $B^* < \Lambda_k^* < A^*$.

Suppose first that $z < 0$, as when the signal-to-noise ratio is small. Then if we
take $A^* = A^{-z}$, $B^* = B^{-z}$, the test $\Sigma^*$ is carrying out the very same operations as
our original sequential test, which we denote by $\Sigma$. The test $\Sigma$ compares

$$\Lambda_k = \prod_{j=1}^{k} \Lambda(g_j; \theta_s) \tag{9-19}$$

with the levels $A$ and $B$, choosing hypothesis $H_1$ when $\Lambda_k$ exceeds $A$, and so on.
When the data are such that test $\Sigma$ selects hypothesis $H_1$, the test $\Sigma^*$ selects $H^*$. We
want the probability $Q_d(\theta)$ that $\Sigma$ selects $H_1$ when the probability density function
of the data is $P_1(g; \theta)$, and this equals the probability that $\Sigma^*$ selects $H^*$ under
hypothesis $H$; that is, it is the false-alarm probability $Q_0^*$ for test $\Sigma^*$. That false-
alarm probability is approximately, according to (9-18),

$$Q_0^* \approx \frac{1 - B^*}{A^* - B^*} = \frac{1 - B^{-z}}{A^{-z} - B^{-z}} = \frac{1 - e^{-bz}}{e^{-az} - e^{-bz}},$$

which is just the expression (9-17).

For z > the passage of A* of (9-19) above the level A, whose overall proba- 
bility we seek, corresponds to the passage of A| below A~ : . If we set B* ~ A~ z and 
A* ~ B~ z , this event entails the test X* choosing hypothesis H, and now (. 

SrfW = I - fio- 

If we apply (9-18) with A and B replaced by B' : and A~ z , respectively, this becomes 

A* ~ B* A* - 5* 5-- - ' 

which again reduces to (9-17). 

In order to calculate the probability $Q_d(\theta)$ of detecting a signal with parameters
$\theta = (S, \theta'')$, therefore, one must solve (9-12) for its one root $z$ that differs from
0. That value of $z(\theta)$ is then substituted into (9-17). As mentioned before, when
$E(u \mid \theta) = 0$, (9-12) has a double root $z = 0$. Applying L'Hôpital's rule to (9-17)
then yields

$$Q_d(\theta) \approx \frac{-b}{a - b}, \qquad E(u \mid \theta) = 0.$$
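
Wald's approximation is easily evaluated numerically. The sketch below assumes the decision levels $a = \ln(Q_{ds}/Q_0)$ and $b = \ln[(1 - Q_{ds})/(1 - Q_0)]$ that follow from Wald's approximations, taken here to be the content of (9-10), and evaluates the detection probability in the form $(1 - e^{-bz})/(e^{-az} - e^{-bz})$, switching to the limit $-b/(a - b)$ near the double root $z = 0$. The function names are hypothetical.

```python
import math

def wald_thresholds(q0, qd_s):
    """Approximate log decision levels (a, b); assumed form of (9-10)."""
    a = math.log(qd_s / q0)
    b = math.log((1.0 - qd_s) / (1.0 - q0))
    return a, b

def detection_probability(z, a, b):
    """Wald's approximation to Q_d for the root z; uses -b/(a - b) when z is near 0."""
    if abs(z) < 1e-9:
        return -b / (a - b)
    # -expm1(-b*z) equals 1 - exp(-b*z) without loss of precision for small z
    return -math.expm1(-b * z) / (math.exp(-a * z) - math.exp(-b * z))

a, b = wald_thresholds(1e-6, 0.99)
for z in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"z = {z:+.1f}  Q_d = {detection_probability(z, a, b):.6g}")
```

At $z = -1$ and $z = 1$ the function returns the design values $Q_0$ and $Q_{ds}$, as (9-18) requires.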

When the input signal-to-noise ratio and the standard signal-to-noise ratio are
both very small, one can use the approximation (9-16) for $z(\theta)$. The resulting
probability $Q_d(\theta)$ is then the same as that calculated by a different approach from Wald's
[Bar46]. The increments $u_j$ in (9-4) are now very small, on the average, and many
stages must usually be traversed before the sum $U_k$ in (9-4) crosses one boundary or
the other and the test terminates. A good approximation to the detection probability
can therefore be obtained by considering the trajectory of the sum $U$ as a Markov
process in continuous time, replacing the stage-number $k$ by a continuous variable
$t$. Were there no barriers at $a$ and $b$, the variable $U$ would have a Gaussian density
function by virtue of the central limit theorem. It would be executing a Brownian
motion, with drift velocity $E(u \mid \theta)$ and diffusion constant $D = \mathrm{Var}(u \mid \theta)$, and its
probability density function would satisfy a Fokker-Planck equation [Hel91, p. 447],
[Pap91, p. 652]. By solving that partial differential equation with absorbing barriers
at $U = a$ and $U = b$, one can determine the probability that $U$ crosses level $a$
before crossing level $b$. This is the probability $Q_d(\theta)$ of detection and turns out to be
the same as that given by Wald's approximation (9-17) with $z(\theta)$ specified by (9-16)
[Hel68, pp. 68-70]. The advantage of this approach is that one does not need to
assume that the increments $u_j$ are logarithms of likelihood ratios as in Wald's analysis;
$u_j$ may be only an approximation to such, usually the threshold approximation.
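
The quality of Wald's approximation can be gauged by simulation. The following sketch, which is an illustration and not a procedure taken from the text, runs the sequential test on Gaussian data of unit variance, accumulating $u = d_s g - \frac{1}{2} d_s^2$, and compares the observed fraction of detections with $(1 - e^{-bz})/(e^{-az} - e^{-bz})$ for $z = 2d/d_s - 1$. The thresholds, the number of trials, and the seed are arbitrary choices. Because the sum overshoots a boundary by a finite amount when the test ends, only rough agreement should be expected unless $d_s$ is small.

```python
import math
import random

def simulate_sprt(d, d_s, a, b, trials, seed=1):
    """Run the Gaussian-data sequential test; return (detection fraction, mean stages)."""
    rng = random.Random(seed)
    detections = 0
    stages = 0
    for _ in range(trials):
        total = 0.0                       # running sum U_k of log likelihood ratios
        while b < total < a:
            g = rng.gauss(d, 1.0)         # datum with mean d and unit variance
            total += d_s * g - 0.5 * d_s**2
            stages += 1
        detections += total >= a          # upper level crossed: signal declared present
    return detections / trials, stages / trials

def wald_qd(z, a, b):
    """Wald's approximation to the detection probability, with the z = 0 limit."""
    if abs(z) < 1e-9:
        return -b / (a - b)
    return (1.0 - math.exp(-b * z)) / (math.exp(-a * z) - math.exp(-b * z))

# Illustrative design: Q_0 = 0.05 and Q_ds = 0.90 for a standard signal with d_s = 0.5
a, b = math.log(0.90 / 0.05), math.log(0.10 / 0.95)
d_s = 0.5
for d in (0.0, 0.25, 0.5):
    q_sim, n_sim = simulate_sprt(d, d_s, a, b, trials=4000)
    q_wald = wald_qd(2.0 * d / d_s - 1.0, a, b)
    print(f"d = {d:.2f}  simulated Q_d = {q_sim:.3f}  Wald Q_d = {q_wald:.3f}  mean stages = {n_sim:.1f}")
```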

In Fig. 9-1 we have plotted the probability $Q_d(d)$ of detection for a sequential
receiver of a coherent signal in white Gaussian noise versus the signal-to-noise ratio
$d = (2E/N)^{1/2}$. The standard signal has a signal-to-noise ratio $d_s = 1$, and it attains
the reliability ($Q_0 = 10^{-6}$, $Q_{ds} = 0.99$). The straight line in the figure represents the
detection probability attained by the Neyman-Pearson receiver that sums $M = 50$
data, as in (8-19); it is carrying out a fixed-sample-size test and attains the same
reliability for the same standard signal-to-noise ratio.

9.2.2 The Average Sample Number

The number of stages in the sequential test, or the number of inputs $v_j(t)$ used
to make the final decision, is a random variable, which will differ from one trial
of the test to another, even though the parameters are the same. To judge the
performance of the sequential receiver it is important to know the average number
of stages $\bar{n}(\theta) = E(n_t \mid H_1, \theta)$ as a function of the invariable parameters $\theta$; here $n_t$
is the number of the stage at which the test terminates. Wald [Wal47, pp. 52-54]
showed this "average sample number" to be approximately

$$\bar{n}(\theta) \approx \frac{a\,Q_d(\theta) + b\,[1 - Q_d(\theta)]}{E(u \mid \theta)}, \tag{9-20}$$

in which the denominator is the expected value of the logarithm of the likelihood
ratio when a signal with parameters $\theta$ is present. The probability $Q_d(\theta)$ is given by
(9-17).

This formula is based on the exact relationship

$$E(U_t \mid \theta) = E(u \mid \theta)\,E(n_t \mid \theta), \tag{9-21}$$

where $U_t$ is the value of the sum $U_k$ in (9-4) at the stage $n_t$ at which the test
terminates. This random variable can be written

$$U_t = \sum_{j=1}^{n_t} u_j = \sum_{j=1}^{\infty} y_j u_j,$$

in which the $y_j$ are random variables such that $y_j = 1$ if the test passes through
stage $j$ ($j \le n_t$) and $y_j = 0$ if it does not ($j > n_t$). Whether the test reaches stage $j$
depends only on the values of the increments $u_i$ for $i < j$, and the random variables
$y_j$ and $u_j$ are statistically independent. The expected value of $U_t$ must therefore be

$$E(U_t \mid \theta) = \sum_{j=1}^{\infty} E(y_j \mid \theta)\,E(u_j \mid \theta) = E(u \mid \theta) \sum_{j=1}^{\infty} E(y_j \mid \theta) = E(u \mid \theta)\,E(n_t \mid \theta),$$

as in (9-21).

Because the increments $u_j$ are very small, the variable $U_t$ is very close to either
$a$ or $b$ when the test terminates, and therefore

$$E(U_t \mid \theta) \approx a\,Q_d(\theta) + b\,[1 - Q_d(\theta)],$$

whence (9-20).

Figure 9-1. Probability $Q_d(d)$ of detection for sequential and fixed-sample-size tests, coherent
signal in Gaussian noise, versus signal-to-noise ratio $d = (2E/N)^{1/2}$: $Q_0 = 10^{-6}$,
$Q_{ds} = 0.99$, $d_s = 1$, $M = 50$.

When $E(u \mid \theta) = 0$ and $z(\theta) = 0$, (9-20) must be replaced by

$$\bar{n}(\theta) \approx -\frac{ab}{\mathrm{Var}(u \mid \theta)}. \tag{9-22}$$

One obtains this by taking $z$ small in (9-17), expanding the numerator and
denominator in powers of $z$, and keeping terms through second order in $z$. One then finds
for the numerator in (9-20)

$$a\,Q_d(\theta) + b\,[1 - Q_d(\theta)] = -\tfrac{1}{2} abz + O(z^2),$$

and as (9-16) is valid in this same limit,

$$E(u \mid \theta) = \tfrac{1}{2} z\,\mathrm{Var}(u \mid \theta) + O(z^2),$$

and (9-20) reduces to (9-22).
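
The average sample number can be computed along the same lines. The sketch below is a minimal illustration under stated assumptions: the root is taken from the second-cumulant approximation $z = 2E(u)/\mathrm{Var}(u)$, which is exact for Gaussian data, the detection probability is Wald's approximation, and the case $E(u) = 0$ falls back on $-ab/\mathrm{Var}(u)$. The function names are hypothetical.

```python
import math

def detection_probability(z, a, b):
    """Wald's approximation to Q_d, with the limit -b/(a - b) at the double root."""
    if abs(z) < 1e-9:
        return -b / (a - b)
    return -math.expm1(-b * z) / (math.exp(-a * z) - math.exp(-b * z))

def average_sample_number(mean_u, var_u, a, b):
    """Average number of stages from (9-20); uses (9-22) when E(u) = 0."""
    if mean_u == 0.0:
        return -a * b / var_u
    z = 2.0 * mean_u / var_u              # root from the second-cumulant approximation
    qd = detection_probability(z, a, b)
    return (a * qd + b * (1.0 - qd)) / mean_u

# Gaussian data with d_s = 1: E(u | d) = d - 1/2 and Var(u | d) = 1
a = math.log(0.99 / 1e-6)
b = math.log(0.01 / (1.0 - 1e-6))
for d in (0.0, 0.5, 1.0):
    print(f"d = {d:.1f}  average stages = {average_sample_number(d - 0.5, 1.0, a, b):.2f}")
```

With these design values the average is largest at $d = d_s/2$, where the drift of the sum vanishes.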

As a function of the signal strength, the average number $\bar{n}$ of stages exhibits
a peak between zero and the strength $S_s$ of the standard signal. As $S$ increases
beyond the standard value, the average sample number $\bar{n}$ decreases toward zero. It
is appropriate to compare the value of $\bar{n}(\theta)$ with the number $M$ of stages needed by a
fixed-sample-size test that attains the same reliability $(Q_0, Q_{ds})$ as the sequential test
when detecting the same standard signal. In general $\bar{n}(0)$ and $\bar{n}(\theta_s)$ are less than this
number $M$: In the absence of any signal or in the presence of the standard signal, the
sequential test attains a specified reliability with a smaller average number of stages
than required by the fixed-sample-size test. For signal strengths midway between 0
and the standard, however, the average number $\bar{n}(\theta)$ of stages may exceed $M$.

A further aspect of the randomness of the number $n_t$ of stages before termination
is that it may occasionally be very large. The probability mass function of
the number $n_t$ is difficult to calculate, but some indication of the variability of $n_t$
is provided by its variance, for which the following rather complicated and approximate
formula can be derived by treating the behavior of the sum $U$ as a continuous
Markov process,

$$\mathrm{Var}\,n_t \approx [E(u)]^{-2} \left\{ \left[1 - \frac{2(a - b)z}{e^{-az} - e^{-bz}}\right] \bar{n}\,\mathrm{Var}\,u - 3(a - b)^2 Q_d (1 - Q_d) \right\}, \qquad z = z(\theta), \tag{9-23}$$

with all expected values taken for the set of parameters $\theta$. This approximation is
valid under the same conditions as the previous ones. For such parameter values
that $E(u \mid \theta) = 0$, we must again apply L'Hôpital's rule, and we obtain

$$\mathrm{Var}\,n_t \approx -\frac{ab(a^2 + b^2)}{3[\mathrm{Var}(u \mid \theta)]^2}, \qquad E(u \mid \theta) = 0.$$

At signal strengths intermediate between 0 and the standard one, both $\bar{n}(\theta)$ and
$\mathrm{Var}\,n_t$ are relatively large. Signals of these strengths may draw out the sequential
test to inordinate lengths before a decision is reached. If such signals are likely to be
present, it may be advisable to force the test to yield a decision after a fixed number
of stages. Such a truncation of the procedure affects the reliability in a way that is
difficult to calculate.
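
The variance of the number of stages can be evaluated in the same manner. The sketch below assumes the continuous-Markov-process form $\mathrm{Var}\,n_t \approx [E(u)]^{-2}\{[1 - 2(a - b)z/(e^{-az} - e^{-bz})]\,\bar{n}\,\mathrm{Var}\,u - 3(a - b)^2 Q_d(1 - Q_d)\}$ with $z = 2E(u)/\mathrm{Var}(u)$, and the limit $-ab(a^2 + b^2)/(3[\mathrm{Var}\,u]^2)$ when the drift vanishes. The expression in braces is a small difference of larger terms when $z$ is small, so the code switches to the limiting form there; the switching threshold is an arbitrary choice, and the function name is hypothetical.

```python
import math

def stage_count_variance(mean_u, var_u, a, b):
    """Approximate variance of the number of stages for drift mean_u and variance var_u."""
    z = 2.0 * mean_u / var_u
    if abs(z) * (a - b) < 1e-4:
        # Zero-drift limit; also avoids cancellation error for very small z
        return -a * b * (a * a + b * b) / (3.0 * var_u**2)
    denom = math.exp(-a * z) - math.exp(-b * z)
    qd = -math.expm1(-b * z) / denom                  # Wald's detection probability
    n_bar = (a * qd + b * (1.0 - qd)) / mean_u        # average sample number
    braces = (1.0 - 2.0 * (a - b) * z / denom) * n_bar * var_u \
        - 3.0 * (a - b) ** 2 * qd * (1.0 - qd)
    return braces / mean_u**2

# Gaussian data with d_s = 1, Q_0 = 1e-6, Q_ds = 0.99
a = math.log(0.99 / 1e-6)
b = math.log(0.01 / (1.0 - 1e-6))
for d in (0.0, 0.5, 1.0):
    sd = math.sqrt(stage_count_variance(d - 0.5, 1.0, a, b))
    print(f"d = {d:.1f}  standard deviation of the stage count = {sd:.1f}")
```

When the lower level is moved far away and the drift is positive, the expression tends to $a\,\mathrm{Var}(u)/[E(u)]^3$, the variance of the first-passage time of a Brownian motion to a single barrier, which offers a check on the formula.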

For the fixed-sample-size test (8-19) with Gaussian data, the false-alarm and
detection probabilities are

$$Q_0 = \operatorname{erfc} \alpha, \qquad Q_d = \operatorname{erfc} \beta, \qquad \beta = \alpha - d_s \sqrt{M},$$

where $d_s^2$ is the standard signal-to-noise ratio, whence

$$M = \frac{(\alpha - \beta)^2}{d_s^2}. \tag{9-24}$$

In the sequential test, by (9-13) and (9-14),

$$E(u \mid d) = d_s d - \tfrac{1}{2} d_s^2 = \tfrac{1}{2} d_s^2 z.$$

The ratio of the average sample number to the number $M$ of data needed by the
fixed-sample-size test is then, from (9-20) and (9-24),

$$\frac{\bar{n}(d)}{M} = \frac{2\{a\,Q_d(d) + b\,[1 - Q_d(d)]\}}{z(\alpha - \beta)^2}, \tag{9-25}$$

independently of the standard signal-to-noise ratio. Through (9-17) the right side is
a function only of $z$, which for Gaussian data is given by (9-14). It is conjectured
that (9-25) is approximately valid for any sequential probability ratio test in the limit
of very small standard signal-to-noise ratio, whereupon $E(n_t) \gg 1$ and $M \gg 1$.

The ratio in (9-25) has been plotted versus $z$ in Fig. 9-2. It exceeds 1 when
$z$ is in the neighborhood of 0, that is, for signal strengths roughly equal to one-half
the standard signal strength. In that figure the lengths of the error bars are
$(\mathrm{Var}\,n_t)^{1/2}/M$, where $\mathrm{Var}\,n_t$ is given in (9-23). At intermediate signal-to-noise ratios,
not only is the expected value of the number of stages before termination very large,
but its standard deviation is also large. The expected values of the increments $u_j$
of the logarithmic likelihood ratio $U$ are then small, and $U$ crosses one barrier
or the other only after a very large and very variable number of data have been
accumulated.
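
The comparison with the fixed-sample-size test can be reproduced with a few lines of code. The sketch below assumes that erfc in this chapter denotes the upper-tail probability of the unit Gaussian distribution, an interpretation that reproduces the value $M = 50$ quoted for $d_s = 1$, and that the decision levels are $a = \ln(Q_{ds}/Q_0)$ and $b = \ln[(1 - Q_{ds})/(1 - Q_0)]$. It evaluates the ratio $\bar{n}/M = 2\{a Q_d + b(1 - Q_d)\}/[z(\alpha - \beta)^2]$ as a function of $z$. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def fixed_sample_size(q0, qd_s, d_s):
    """M = (alpha - beta)**2 / d_s**2, with alpha and beta the Gaussian upper-tail quantiles."""
    alpha = -NormalDist().inv_cdf(q0)      # upper-tail probability at alpha equals q0
    beta = -NormalDist().inv_cdf(qd_s)     # upper-tail probability at beta equals qd_s
    return (alpha - beta) ** 2 / d_s**2

def sample_number_ratio(z, q0, qd_s):
    """Ratio of the average sample number to M for Gaussian data, as a function of z."""
    a = math.log(qd_s / q0)
    b = math.log((1.0 - qd_s) / (1.0 - q0))
    spread = fixed_sample_size(q0, qd_s, 1.0)          # (alpha - beta)**2
    if abs(z) < 1e-9:
        return -a * b / spread                         # limit at the double root z = 0
    qd = -math.expm1(-b * z) / (math.exp(-a * z) - math.exp(-b * z))
    return 2.0 * (a * qd + b * (1.0 - qd)) / (z * spread)

print(f"M = {fixed_sample_size(1e-6, 0.99, 1.0):.1f}")
for z in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"z = {z:+.1f}  ratio = {sample_number_ratio(z, 1e-6, 0.99):.3f}")
```

The ratio is below 1 at $z = \pm 1$ and above 1 near $z = 0$, in keeping with the behavior described for Fig. 9-2.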



9.3 SEQUENTIAL DETECTION OF SIGNALS OF 
RANDOM PHASE 



When the inputs

$$v_k(t) = n_k(t) + s_k(t; \theta) \qquad (H_1)$$

contain signals $s_k(t; \theta)$ that are quasiharmonic pulses

$$s_k(t; \theta) = A\,\mathrm{Re}\,F(t; \theta'') \exp(i\Omega t + i\psi_k)$$

with phases $\psi_k$ independently random from one signal to the next, but with a common
though unknown amplitude $A$, not to be confused with the upper threshold in
the sequential test, we can use the methods just described to set up and evaluate a
sequential test for deciding whether a train of such signals is present in the inputs
$v_1(t), v_2(t), \ldots$, successively observed by the receiver. We shall briefly summarize
the necessary formulas, leaving their derivation to the interested reader.
As data we take the random variables

$$r_k = \left| \int_0^T F^*(t)\,V_k(t)\,dt \right| \left[ N \int_0^T |F(t)|^2\,dt \right]^{-1/2}, \tag{9-26}$$

where $V_k(t)$ is the complex envelope of the $k$th input and $F(t)$ equals $F(t; \theta'')$, the
complex envelope of a signal with the standard set $\theta''$ of invariable parameters; $N$ is
the unilateral spectral density of the noise, assumed white and Gaussian. Let

$$D = A \left[ \frac{1}{N} \int_0^T |F(t)|^2\,dt \right]^{1/2}$$

be the signal-to-noise ratio for a signal with amplitude $A$, and denote the standard
signal-to-noise ratio by $D_s$. As in (3-117) the probability density function of each
datum is

$$P_1(r; D) = r\,e^{-\frac{1}{2}(r^2 + D^2)} I_0(Dr)\,U(r)$$

when a signal is present with amplitude $A$; when no signal is present, the density
function of the $r_k$'s is

$$P_0(r) = r\,e^{-\frac{1}{2} r^2}\,U(r).$$

The logarithmic likelihood ratio is

$$u = \ln \frac{P_1(r; D_s)}{P_0(r)} = \ln I_0(D_s r) - \tfrac{1}{2} D_s^2 \tag{9-27}$$

and is a monotone function of the datum $r$. At the end of each interval of duration
$T$ the logarithmic likelihood ratio $u$ is determined from the input
$v_k(t) = \mathrm{Re}\,V_k(t) \exp i\Omega t$ and added to the sum $U$ carried over from the previous interval.
The new value of $U$ is compared with the decision levels $a = \ln A$ and $b = \ln B$
specified by (9-10). If $U < b$, the receiver decides that its inputs have contained
only noise; if $U > a$, it decides that signals of some positive, but unknown amplitude
have also been present. In either event the test ends. If $b < U < a$, the receiver
causes another pulse to be transmitted and processes the input $v_{k+1}(t)$ during the
subsequent interval in the same manner.

Figure 9-2. Ratio of average sample number $E(n_t) = \bar{n}(d)$ to number $M$ of
data in equipollent fixed-sample-size test with Gaussian noise versus parameter
$z$: $Q_0 = 10^{-6}$, $Q_{ds} = 0.99$.

To determine the probability $Q_d(D)$ of detection as a function of the
signal-to-noise ratio $D$, one must first calculate the moment-generating function of the
logarithmic likelihood ratio $u$, which is given by (9-11),

$$h(z; D) = \int_0^\infty \left[\frac{P_1(r; D_s)}{P_0(r)}\right]^{-z} P_1(r; D)\,dr = \exp\left(\tfrac{1}{2} D_s^2 z\right) \int_0^\infty [I_0(D_s r)]^{-z} P_1(r; D)\,dr. \tag{9-28}$$

This function cannot be obtained in closed form. A power series can be developed
by expanding the function $[I_0(x)]^{-z}$ in powers of $x$, substituting into (9-28), and
integrating term by term by means of (C-5). In terms of $S_s = \tfrac{1}{2} D_s^2$ and $S = \tfrac{1}{2} D^2$,
the result is

$$\begin{aligned}
h(z; D) = 1 &+ \tfrac{1}{2} z(1 + z) S_s^2 \left[1 - \tfrac{2}{3}(2 + z) S_s + \tfrac{1}{4}(11 + 11z + 3z^2) S_s^2\right] \\
&- z S S_s \left[1 - (1 + z) S_s + \tfrac{1}{2}(1 + z)(4 + 3z) S_s^2\right] + \tfrac{1}{4} z(1 + 2z) S^2 S_s^2 + \cdots.
\end{aligned} \tag{9-29}$$

When a signal whose signal-to-noise ratio is $D$ is present, the expected value and
the variance of the logarithmic likelihood ratio $u$ are

$$\begin{aligned}
E(u \mid D) &= -\tfrac{1}{2} S_s^2 \left(1 - \tfrac{4}{3} S_s + \tfrac{11}{4} S_s^2\right) + S S_s \left(1 - S_s + 2 S_s^2\right) - \tfrac{1}{4} S^2 S_s^2 + \cdots, \\
\mathrm{Var}(u \mid D) &= S_s^2 (1 + 2S - 2S_s - 6 S_s S) + O(S_s^4),
\end{aligned} \tag{9-30}$$

and from (9-29) the root of the equation $h(z; D) = 1$ is approximately

$$z(D) \approx \frac{2S}{S_s} - 1 = \frac{2D^2}{D_s^2} - 1.$$

These results can be put into the equations given previously for calculating the
probability of detection, the average sample number, and the variance of the number
of stages. The approximation is most reliable when both $S$ and $S_s$ are small and the
average sample number $\bar{n}(D)$ is very large.
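
Since the moment-generating function is not available in closed form, its nonzero root can also be found by direct numerical integration. The sketch below is an illustration under stated assumptions: it takes $h(z; D) = \exp(\frac{1}{2} D_s^2 z) \int_0^\infty [I_0(D_s r)]^{-z}\, r\, e^{-(r^2 + D^2)/2} I_0(Dr)\,dr$, evaluates the integral by Simpson's rule on a truncated range, computes $I_0$ from its power series, and locates the root by bisection on a bracket supplied by the caller that excludes $z = 0$. All function names and numerical settings are hypothetical.

```python
import math

def bessel_i0(x):
    """Modified Bessel function I_0(x) from its power series (adequate for moderate x)."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= (0.5 * x / k) ** 2
        total += term
    return total

def mgf(z, d, d_s, r_max=12.0, steps=600):
    """h(z; D) by Simpson's rule on 0 <= r <= r_max (steps must be even)."""
    step = r_max / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * step
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        density = r * math.exp(-0.5 * (r * r + d * d)) * bessel_i0(d * r)
        total += weight * bessel_i0(d_s * r) ** (-z) * density
    return math.exp(0.5 * d_s**2 * z) * total * step / 3.0

def nonzero_root(d, d_s, lo, hi, iterations=30):
    """Bisection for h(z; D) = 1 on a bracket [lo, hi] that excludes the root z = 0."""
    f_lo = mgf(lo, d, d_s) - 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = mgf(mid, d, d_s) - 1.0
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

d_s, d = 0.3, 0.1
approx = 2.0 * d**2 / d_s**2 - 1.0        # small-signal approximation to the root
print(f"numerical root = {nonzero_root(d, d_s, -1.2, -0.3):.4f}, approximation = {approx:.4f}")
```

The roots for $D = 0$ and $D = D_s$ should come out as $-1$ and $+1$ to within the integration error, whatever the value of $D_s$.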

It is a temptation to approximate the logarithmic likelihood ratio in (9-27) by
expanding the logarithm as in (4-14) and keeping only the term proportional to $r^2$.
The system would then accumulate the statistics

$$u' = -\tfrac{1}{2} D_s^2 + \tfrac{1}{4} D_s^2 r^2$$

from each observation interval. However, as Bussgang and Mudgett [Bus60] and
Blasbalg [Bla61] pointed out, the expected value of $u'$ vanishes when no signal is
present, $E(u' \mid H_0) = 0$, and the test may run through many stages before a decision
is reached. They suggested replacing the term of fourth order in the expansion
by its expected value under hypothesis $H_0$ to obtain the approximate test statistic

$$u'' = -\tfrac{1}{2} D_s^2 + \tfrac{1}{4} D_s^2 r^2 - \tfrac{1}{8} D_s^4.$$

By solving the integral equations arising in the exact analysis of the sequential test,
Kendall [Ken65] calculated the false-alarm probability $Q_0$ and the average number
$\bar{n}(0)$ of stages under hypothesis $H_0$ for a sequential test using the statistic

$$u = -\tfrac{1}{2} D_s^2 + \tfrac{1}{4} D_s^2 r^2 + \beta,$$

where $\beta$ is an arbitrary constant. He found that both $Q_0$ and $\bar{n}(0)$ are much larger
when $\beta = 0$ and $u = u'$ than when $\beta = -D_s^4/8$ and $u = u''$. The principal reason
for preferring $u'$ or $u''$ over $u$ of (9-27) is that a quadratic rectifier is more easily
constructed than one having the characteristic $\ln I_0(x)$. The approximation actually
used, however, must be carefully selected if the advantages of the sequential test are
to be preserved.



9.4 SEQUENTIAL DETECTION OF TARGETS OF 
UNKNOWN DISTANCE 

In searching for targets by means of radar, it is usually necessary to detect those
lying anywhere in a range many times longer than the spatial length of a signal. The
echoes may arrive at any time during the interpulse interval $T_p$ between transmitted
pulses. We treated the problem of detecting a signal of unknown arrival time in
the input $v(t)$ during a single interpulse interval in Chapter 7, and here we wish to
describe some efforts to apply sequential detection to this task.

It is customary to divide the interpulse interval $T_p$ into subintervals of a duration
$T'$ on the order of the reciprocal bandwidth of the signals to be detected. The
total number of subintervals, or range bins, will be denoted by $L = T_p/T'$. Attention
is focused on signals whose leading edges reach the receiver at the beginning of
a subinterval; they are substantially past by the end of the same subinterval. In our
discussion we suppose the target to be stationary, its echoes appearing at the same
relative position in each interpulse interval.

The receiver contains a filter matched to such a signal over an interval of
duration $T'$, and the output of this filter is rectified and sampled at the end of each
subinterval to provide a statistic $r$ of the form of (9-26). (In practice the matching
may be imprecise.) During the $j$th interpulse interval $T_p$ the receiver thus generates
$L$ samples $r_1^{(j)}, \ldots, r_L^{(j)}$, which are statistically nearly independent. It is on these
that its decisions are based.

In the sequential detection system proposed by Marcus and Swerling [Mar62]
it is postulated that at most one signal is present and that it may appear with equal
probability $1/L$ in any subinterval. At the end of each interpulse interval the receiver
forms an average likelihood ratio for the data received since the beginning of the test.
Under the approximation that the data $r_i^{(j)}$ are statistically independent, the average
likelihood ratio at the end of the $m$th interpulse interval is

$$\Lambda_m = \frac{1}{L} \sum_{i=1}^{L} \prod_{j=1}^{m} \frac{P_1(r_i^{(j)})}{P_0(r_i^{(j)})}, \tag{9-31}$$

where $P_0(r)$ is the probability density function of the datum $r$ when no signal is
present, and $P_1(r) = P_1(r; \theta_s)$ is the probability density function of $r$ when a
standard signal is present, an average having been taken over any random parameters
such as the phase. Because a standard signal can appear in no more than one
subinterval, the terms in the summation in (9-31) contained factors $P_0(r_s^{(j)})$, $s \ne i$, which
canceled from numerator and denominator.
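
A short sketch may make the structure of this average likelihood ratio concrete. It assumes the form $\Lambda_m = L^{-1} \sum_i \prod_j P_1(r_i^{(j)})/P_0(r_i^{(j)})$ and, for signals of random phase, the single-datum ratio $P_1(r)/P_0(r) = e^{-D_s^2/2} I_0(D_s r)$. The products are accumulated as sums of logarithms, and the largest term is factored out before exponentiating to avoid overflow. The layout of the data array and the function names are illustrative assumptions.

```python
import math

def log_bessel_i0(x):
    """ln I_0(x) from the power series of I_0 (adequate for moderate x)."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= (0.5 * x / k) ** 2
        total += term
    return math.log(total)

def average_likelihood_ratio(data, d_s):
    """Average likelihood ratio; data[j][i] is the sample from bin i in interval j."""
    num_bins = len(data[0])
    # Logarithm of the product over intervals, one entry per candidate range bin
    logs = [sum(log_bessel_i0(d_s * row[i]) - 0.5 * d_s**2 for row in data)
            for i in range(num_bins)]
    peak = max(logs)                       # factor out the largest term for stability
    return math.exp(peak) * sum(math.exp(v - peak) for v in logs) / num_bins

# Two interpulse intervals, three range bins; the third bin holds large samples
samples = [[0.8, 1.1, 3.0],
           [1.3, 0.6, 3.2]]
print(f"average likelihood ratio = {average_likelihood_ratio(samples, 1.0):.3f}")
```

The value returned would then be compared with the levels $A$ and $B$ after each interpulse interval.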

This average likelihood ratio $\Lambda_m$ is compared with decision levels $A$ and $B$
determined by Wald's approximations as in (9-10), and the decisions are made as
previously described. If $B < \Lambda_m < A$, the transmitter is ordered to send out another
pulse, and data are collected from the following interval to permit forming a new
likelihood ratio $\Lambda_{m+1}$. Marcus and Swerling simulated such a sequential test on a
digital computer and obtained average sample numbers as functions of the
signal-to-noise ratio and the reliability of detection. The test was found to require a somewhat
smaller signal-to-noise ratio than a fixed-sample-size test with the same reliability
and a total number $M$ of stages equal to the average sample number of the sequential
test. The greater the number $L$ of subintervals, the smaller the saving in
signal-to-noise ratio. They also observed that for signal strengths intermediate between 0 and
the standard strength the average number of stages did not become much larger than
the average sample numbers for the zero and the standard strengths. Truncation of
such a sequential test to avoid excessive lengths may be unnecessary.

An alternative suggested by Kendall and Reed [Ken63] allows a signal to appear 
in any subinterval with a probability q; the probability distribution of the total 
number of targets is then binomial. They presented the form of the likelihood ratio 
for this test and pointed out that as with the Marcus and Swerling test, simulation 
would be necessary to evaluate its performance. 

In another form of sequential detection a sequential test is applied to the data
from each subinterval. Pulses are transmitted in a certain direction until the tests of
all $L$ subintervals have terminated. If at the $k$th stage, for instance, the sequential
test for the $j$th subinterval has not yet reached a final decision, the receiver will use
the data $r_j^{(1)}, r_j^{(2)}, \ldots, r_j^{(k)}$ to form a likelihood ratio

$$\Lambda_j^{(k)} = \Lambda(r_j^{(1)}, r_j^{(2)}, \ldots, r_j^{(k)}) = \prod_{i=1}^{k} \frac{P_1(r_j^{(i)})}{P_0(r_j^{(i)})},$$

which is compared with two levels $A$ and $B$. If $\Lambda_j^{(k)} < B$, the receiver decides that
no target is present in the $j$th subinterval, and if $\Lambda_j^{(k)} > A$, it decides that there is a
target there. In either event the test for the $j$th subinterval terminates, and the data
$r_j^{(i)}$ from future interpulse intervals ($i > k$) are disregarded. If, on the other hand,
$B < \Lambda_j^{(k)} < A$, the $j$th subinterval will be examined again after the $(k + 1)$th pulse
is transmitted.

The average number of pulses needed to attain a specified reliability with such
a system was calculated in some representative cases by Reed and Selin [Ree63] and
Bussgang and Ehrman [Bus65]. The writer worked out the median number of pulses
transmitted when no signal is present, as a function of the number $L$ of subintervals.
This number roughly determines the average time needed to scan a fixed portion of
the sky that is empty of targets [Hel62].

In that calculation it was assumed that the level $a = \ln A$ is so high and the
false-alarm probability $Q_0$ so small that under hypothesis $H_0$ the crossings of the
upper level $a$ can be neglected. Then one can use a formula given by Wald [Wal47,
p. 193] for the probability distribution of the number $n_t$ of stages before a single
sequential test terminates:

$$\begin{aligned}
P_m = \Pr(n_t \le m) &= \left(\frac{\alpha}{2\pi}\right)^{1/2} \int_0^{\eta} x^{-3/2} \exp\left[-\frac{\alpha (x - 1)^2}{2x}\right] dx \\
&= \operatorname{erfc}\left[\left(\frac{\alpha}{\eta}\right)^{1/2} (1 - \eta)\right] + e^{2\alpha} \operatorname{erfc}\left[\left(\frac{\alpha}{\eta}\right)^{1/2} (1 + \eta)\right],
\end{aligned} \tag{9-32}$$

where $\eta = m/\bar{n}$, $\bar{n} = \bar{n}(0) = E(n_t \mid H_0)$, and $\alpha = \bar{n}^2/\mathrm{Var}_0\,n_t$. In this situation, with
$Q_0 \ll 1$,

$$\bar{n}(0) \approx \frac{b}{E(u \mid H_0)} \approx -\frac{2b}{S_s^2}$$

by (9-20) and (9-30), and

$$\mathrm{Var}_0\,n_t \approx \frac{\bar{n}\,\mathrm{Var}_0\,u}{[E(u \mid H_0)]^2}$$

by (9-23), so that

$$\alpha \approx -\tfrac{1}{2} b = \tfrac{1}{2} \left|\ln(1 - Q_d)\right|.$$

One can also derive (9-32) from the approximate representation of the trajectory of
the sum $U$ as a Markov process in continuous time.

If $n_t^{(L)}$ denotes the number of radar pulses that need to be transmitted before
the sequential subtests in all $L$ range bins terminate,

$$\Pr(n_t^{(L)} \le m) = P_m^L,$$

for $n_t^{(L)} \le m$ if and only if all $L$ of the subtests have terminated with fewer than
$m + 1$ stages. The median $\nu$ of the number of transmitted pulses is therefore given
by

$$\tfrac{1}{2} = P_\nu^L \qquad \text{or} \qquad P_\nu = 2^{-1/L}.$$

The quotient $\nu/\bar{n}$ can be obtained by setting $P_m = 2^{-1/L}$ in (9-32) and solving for
the parameter $\eta$. Denote this solution by $\eta_L$.

We want to compare this median $\nu$ with the number $M$ of stages needed by
the Neyman-Pearson receiver to attain the same reliability $(Q_0, Q_d)$. That receiver
can be regarded as basing its decisions on the sum

$$G = \sum_{k=1}^{M} r_k^2$$

in each range bin, where the $r_k$ are defined in (9-26). This is the threshold receiver,
as in (4-19); the input signal-to-noise ratio is assumed so small that the weak-signal
approximation can be made, and $M \gg 1$. Then the sum $G$ will have approximately
a Gaussian distribution, and the false-alarm and detection probabilities are

$$Q_0 \approx \operatorname{erfc} g_0, \qquad Q_d \approx \operatorname{erfc} g_1, \qquad g_1 = g_0 - S_s \sqrt{M},$$

for $S_s = \tfrac{1}{2} D_s^2$ is the effective signal-to-noise ratio, defined as in (4-66). Hence the
number $M$ of pulses required by the Neyman-Pearson receiver is

$$M = \frac{(g_0 - g_1)^2}{S_s^2}.$$

The ratio $\nu/M$ of the median number of pulses required by the sequential
receiver to the number required by the standard receiver is now

$$\frac{\nu}{M} = \frac{\bar{n}\,\eta_L}{M} \approx \frac{2|b|\,\eta_L}{(g_0 - g_1)^2},$$

and it is independent of the standard signal-to-noise ratio, provided that the standard
ratio is small. The ratio $\nu/M$ is plotted in Fig. 9-3 versus the number $L$ of range
bins for detection probabilities $Q_d$ of 0.9, 0.99, and 0.999 and an overall false-alarm
probability

$$Q_0^{(L)} = 1 - (1 - Q_0)^L \approx L Q_0 = 10^{-6}$$

for all $L$ subintervals. The ratio $\nu/M$ ranges from about 0.3 to about 0.6 as $L$
increases from 10 to 1000. Thus the sequential receiver makes it possible to scan
a certain part of the sky in about half the time needed by a conventional receiver
when no or few targets are present.

Figure 9-3. Comparison of sequential and conventional detector: $\nu/M$ versus
$L \approx \Delta\omega\,T$; $Q_0 = 10^{-6}$. Curves are indexed with the probability $Q_d$ of detection.
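
The calculation of the median can be sketched in code. The example below rests on several assumptions and is not the computation behind Fig. 9-3: erfc is taken as the upper-tail probability of the unit Gaussian distribution, the distribution of the number of stages is $P(\eta) = \operatorname{erfc}[(\alpha/\eta)^{1/2}(1 - \eta)] + e^{2\alpha} \operatorname{erfc}[(\alpha/\eta)^{1/2}(1 + \eta)]$ with $\alpha = -\frac{1}{2} b$, the lower level is $b = \ln[(1 - Q_d)/(1 - Q_0)]$, and the per-bin false-alarm probability is $Q_0 = 10^{-6}/L$. The equation $P(\eta_L) = 2^{-1/L}$ is solved by bisection, and the ratio is formed as $2|b|\eta_L/(g_0 - g_1)^2$. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def gauss_tail(x):
    """Upper-tail probability of the unit Gaussian distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def stage_cdf(eta, alpha):
    """Probability that a single subtest ends within eta times its average length."""
    s = math.sqrt(alpha / eta)
    return gauss_tail(s * (1.0 - eta)) + math.exp(2.0 * alpha) * gauss_tail(s * (1.0 + eta))

def median_ratio(num_bins, qd, q0_total=1e-6):
    """Approximate ratio of median pulse count to M for num_bins range bins."""
    q0 = q0_total / num_bins                       # per-bin false-alarm probability
    b = math.log((1.0 - qd) / (1.0 - q0))          # lower decision level
    alpha = -0.5 * b
    target = 2.0 ** (-1.0 / num_bins)
    lo, hi = 1e-6, 100.0
    for _ in range(80):                            # bisection on the increasing distribution
        mid = 0.5 * (lo + hi)
        if stage_cdf(mid, alpha) < target:
            lo = mid
        else:
            hi = mid
    eta_l = 0.5 * (lo + hi)
    g0 = -NormalDist().inv_cdf(q0)
    g1 = -NormalDist().inv_cdf(qd)
    return 2.0 * abs(b) * eta_l / (g0 - g1) ** 2

for num_bins in (10, 100, 1000):
    print(f"L = {num_bins:4d}  ratio for Q_d = 0.99: {median_ratio(num_bins, 0.99):.2f}")
```

The ratio grows slowly with the number of range bins, since the median is governed by the longest of the $L$ subtests.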

If the radar target might be moving rapidly, the receiver must contain a number
of Doppler channels, that is, a bank of filters matched to the signal, but with
pass frequencies distributed over the range of expected Doppler shifts about the
carrier frequency of the transmitted pulses. The output of each such filter during
each interpulse interval will again be divided into a number of range bins, and a
sequential test will be conducted in each as just described. Each Doppler channel
must compensate for the changing arrival times of the echoes from a target moving
with the corresponding velocity.

The total number of effectively independent tests will be on the order of

$$L = (\Delta\omega\,T)(\Delta t\,W_d),$$

where $\Delta\omega$ is the rms bandwidth of the signal and $\Delta t$ is its rms duration, as defined
in (6-105) and (6-106), and $W_d$ is the width of the expected range of Doppler shifts.
With this definition of $L$, Fig. 9-3 will again roughly represent the saving in the
average number of transmitted pulses achieved by using a sequential rather than a
fixed-sample-size test.

Problems 

9-1. The signals $s_k(t)$ are known completely except for a common amplitude $A$. They arrive
at a known time in the midst of white Gaussian noise of unilateral spectral density
$N$, and in the $k$th observation interval of duration $T$ the input to the receiver is either
$(H_0)$ $v_k(t) = n_k(t)$ or $(H_1)$ $v_k(t) = s_k(t) + n_k(t)$. Take a standard signal of energy $E_s$ and
signal-to-noise ratio $d_s = (2E_s/N)^{1/2}$, which is to be detected with probability $Q_d(d_s)$,
and set up a sequential probability ratio test of the kind described in Sec. 9.2.

(a) Express the logarithmic likelihood ratio $u_j$ for the $j$th interval in terms of the input
$v_j(t)$ during that interval.

(b) Show that $u_j$ is a Gaussian random variable and derive its expected value and
variance when a signal of energy $E$ is present.

(c) Let $d = (2E/N)^{1/2}$. Find the moment-generating function $h(z; d)$ of the statistic
$u$, and prove (9-14).

(d) Write down formulas for the probability $Q_d(d)$ of detection and the average number
$\bar{n}(d)$ of stages of the sequential test.

(e) Plot $\bar{n}(d)$ and $Q_d(d)$ as functions of the signal-to-noise ratio $d$ for $0 < d < d_s = 1$,
with $Q_d(d_s) = 0.90$, $Q_0 = 10^{-6}$. Use the approximations given in this chapter.

9-2. Work out a sequential system for deciding whether the variance of a Gaussian noise
input is $N_0$ or $N_1$, $N_1 > N_0$. The second variance $N_1$ may represent the sum of the
variance $N_0$ of ordinary noise and the variance $N_s = N_1 - N_0$ of an independent noiselike
signal added to it. Such a signal could be the output of a radar-jamming transmitter.
Available are independent samples $x_1, x_2, \ldots$ of the input; their expected values are 0.

(a) Find the logarithmic likelihood ratio $u_j = \ln[p_1(x_j)/p_0(x_j)]$, where $p_0(x_j)$ and
$p_1(x_j)$ are Gaussian probability density functions of expected value 0 and variances
$N_0$, $N_1$, respectively.

(b) Show that if the true variance of the data is $N$, $N > N_0$, the moment-generating
function of the statistic $u$ is

$$h(z; N) = \left(\frac{N_1}{N_0}\right)^{z/2} \left[1 + zN\left(\frac{1}{N_0} - \frac{1}{N_1}\right)\right]^{-1/2}.$$

(c) Calculate the expected value and variance of the statistic $u$ when the true noise
variance is $N$.

(d) Show how to determine the probability of detection and the average number $\bar{n}$ of
stages of the test as functions of $N$. Use the approximate formulas given in this
chapter.

9-3. Outline a sequential system to detect the 0's and 1's transmitted over the
Rayleigh-fading channel of Sec. 4.2.3. Suppose that there is a noiseless channel whereby the
receiver can tell the transmitter when to stop repeating one symbol of the message and 
go on to the next. Assume that the signal amplitudes and phases are independently 
random, and set the system up for signals arriving with amplitudes drawn from a 
standard Rayleigh distribution (4-17) with the parameter s equal to s . Determine the 
logarithmic likelihood ratio u, and calculate its moment-generating function, expected 
value, and variance when the true parameter of the Rayleigh distribution is s [Bas59]. 

9-4. In a receiver of optical signals, light falls on a photoelectric detector, and during
successive intervals of duration $T$ the receiver counts the numbers $n_1, n_2, \ldots$ of
photoelectrons ejected by the light. The receiver must decide between hypothesis $H_0$ that only
background light is present at its input and the hypothesis $H_1$ that an optical signal is
also present. The probabilities of counting $n_j$ electrons in the $j$th interval under each
hypothesis are

$$\Pr(n_j = k \mid H_i) = \frac{\lambda_i^k}{k!} \exp(-\lambda_i), \qquad i = 0, 1, \quad \lambda_1 > \lambda_0.$$

The numbers $n_j$ in successive intervals are statistically independent.

(a) Show how to set up a sequential detector of the optical signal by using Wald's
sequential probability ratio test, assuming a standard expected value $\lambda_{1s}$ for the
numbers $n_j$ under hypothesis $H_1$. Here the data are governed not by probability
density functions, but by probability mass functions.

(b) Using Wald's approximations, determine the decision levels.

(c) Calculate the moment-generating function of the statistic $u$ that is accumulated
at each stage of the test, and show how to determine the probability of detection
and the average number of stages when the expected number of electrons under
hypothesis $H_1$ is not $\lambda_{1s}$, but some other value $\lambda_1 > \lambda_0$.

(d) Show how to compare the performance of this test with that of a fixed-sample-size
test that decides between hypothesis $H_0$ and hypothesis $H_1$ on the basis of the total
number $n_1 + n_2 + \cdots + n_M$ of electrons counted in $M$ successive intervals.

Signal Resolution



10.1 SPECIFICATION OF RECEIVERS 

10.1.1 Varieties of Resolution 

Up to now we have studied the detection only of individually identifiable signals. 
Radar targets, however, may be so close together that their echoes appear indistin- 
guishable, and the receiver must decide whether to attribute its input to one signal, 
to several, or to none. A signal overlapping a weaker one of the same kind may 
well conceal it, as when the radar echo from a large bomber hides one from a small 
fighter plane nearby. To identify a ballistic missile among a cloud of decoys requires 
sorting out a multitude of echo signals. The echo from a low-flying aircraft arrives 
in the midst of a throng of weak, random signals reflected from the ground; these 
create the interference known as clutter. In a ground-mapping radar it is the detailed 
structure of the reflections themselves that is of interest. 

The process of deciding whether the input to a receiver contains one signal
or a number of adjacent signals is called resolution. The term is borrowed from
optics, which has long been concerned with the efficient resolution of close images.
One speaks of resolving the echoes of the fighter plane and the bomber or of the
decoys and the missile. A ground-mapping radar that accurately reproduces details
of the terrain it scans is said to provide good resolution. The quality of signals
determining whether they can be easily resolved is called their resolvability. Echoes
of the same transmitted radar pulse from diverse targets may differ in several respects:
in time of arrival $\tau$ because of a difference of target distances, in carrier frequency
$\Omega$, by virtue of a difference of target velocities, and in the antenna azimuth $\theta$ for
maximum amplitude, owing to a difference of target bearings. Any combination of
such parameters may be utilized to resolve the signals.

The study of radar resolution falls into two parts, signal design and receiver 
design. Most attention has been given to the former, which seeks transmitted pulses 
whose echoes are most easily resolved. The latter, concerned with how the receiver 
should be modified to accommodate signals that may overlap, has been less widely 
developed, possibly because it is so complex. 

There are various aspects under which the resolution of signals can be consid- 
ered, and they are exemplified by the situations mentioned in the first paragraph. 
One can suppose that at most two signals might be present and that the receiver 
must either detect a weak signal in the presence of a strong one or decide whether 
two signals, one, or none is present in its input. Alternatively, one can confront the
possibility that an arbitrary number of signals are present and ask the receiver
to estimate the number of signals as well as their arrival times, frequencies, or
other parameters. A third type of problem is the detection of a given signal against 
a background of many weak, random echoes such as make up the ground clutter; it 
can often be viewed as the detection of a signal in colored Gaussian noise. Ground 
mappers and similar radars can be regarded as measuring the scattering function of 
a surface or a volume of space, and their operation can be treated as the estimation 
of a random process. The first three of these aspects of resolution will be discussed 
in this chapter; the fourth is beyond the scope of this book. Ground-mapping radars 
are treated in [Rih69, pp. 441-83], [Har70], [Mor88, pp. 157-85], [Bla91, pp. 39-50],
and [Cur91]. 

In what follows we shall assume that the signals are corrupted by white Gaus- 
sian noise of unilateral spectral density N. The modifications needed to accommodate 
colored noise are mostly evident, but to make them is uninstructive. Although signal 
resolution will be analyzed from the standpoint of hypothesis testing, it will be found 
that for any useful progress compromises must be made and something less than an 
optimum system must be accepted. 

10.1.2 Resolution of Two Signals 

In the most elementary and least common situation, one or the other, or both or
neither, of two signals may be present in the input $v(t)$ of the receiver. We take them
to be narrowband signals of the form

$$\begin{aligned}
s_a(t) &= \operatorname{Re} S_a(t)\,e^{i\Omega t}, & S_a(t) &= A\,F(t)\exp i\phi_1,\\
s_b(t) &= \operatorname{Re} S_b(t)\,e^{i\Omega t}, & S_b(t) &= B\,G(t)\exp i\phi_2,
\end{aligned} \tag{10-1}$$

having independent and unknown phases and amplitudes and arriving in the presence
of white Gaussian noise. We suppose that the functions $F(t)$ and $G(t)$ are given,
$F(t) \ne G(t)$, and that they are normalized so that

$$\int_0^T |F(t)|^2\,dt = \int_0^T |G(t)|^2\,dt = 1. \tag{10-2}$$

The former signal will be called signal A, the latter signal B. Signal B may be a copy
of signal A arriving earlier or later, $G(t) = F(t - \tau)$, with a given delay $\tau$. On the
basis of the input $v(t)$ received during the observation interval $(0, T)$, the observer
must choose one of four hypotheses: ($H_0$) neither signal A nor signal B is present;
($H_1$) signal A alone is present; ($H_2$) signal B alone is present; and ($H_3$) both signals A
and B are present. A prescription for making this choice can be termed a resolution
strategy. We treat this problem because it provides an insight into how the forms of
the signals affect their resolvability and because it represents a simple special case
of the more general problem to be analyzed in Sec. 10.1.3.

If no noise were present, the observer could decide without error which of the 
four hypotheses is true by passing the input through properly matched filters and 
observing the output. It is easy to verify that the following two test statistics, among 
others, yield the desired information: 

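$$\hat A = \left|\frac{1}{1-|\lambda|^2}\int_0^T \left[F^*(t) - \lambda G^*(t)\right] V(t)\,dt\right|,\qquad
\hat B = \left|\frac{1}{1-|\lambda|^2}\int_0^T \left[G^*(t) - \lambda^* F^*(t)\right] V(t)\,dt\right|, \tag{10-3}$$

where the complex scalar product

$$\lambda = \int_0^T F^*(t)\,G(t)\,dt \tag{10-4}$$

of the two signals measures the extent to which they overlap.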

If hypothesis $H_0$ is true, $\hat A = \hat B = 0$; under hypothesis $H_1$, $\hat A = A$ and $\hat B = 0$;
under $H_2$, $\hat A = 0$ and $\hat B = B$; and under $H_3$, $\hat A = A$ and $\hat B = B$. If either of the
quantities in (10-3) vanishes, the corresponding signal is absent. From our work
in Sec. 3.3.1 we know that the quantity $\hat A$ can be generated by passing the input
$v(t)$ through a narrowband filter matched to a signal having the complex envelope
$[F(t) - \lambda^* G(t)]/(1 - |\lambda|^2)$, that is, through one with a complex impulse response

$$K_a(s) = \begin{cases}
(1 - |\lambda|^2)^{-1}\left[F^*(T - s) - \lambda G^*(T - s)\right], & 0 < s < T,\\
0, & s < 0,\ s > T.
\end{cases}$$

A similar filter with complex impulse response $K_b(s)$, matched to a signal whose
complex envelope is $[G(t) - \lambda F(t)]/(1 - |\lambda|^2)$, can be used to form the statistic $\hat B$.
At the end of the observation interval the rectified outputs of these filters are the
desired values of $\hat A$ and $\hat B$.

If the input contains additive random noise $n(t)$, the statistics $\hat A$ and $\hat B$ do not
vanish, even when the signals are absent, and the observer is faced with a statistical
problem. He must set up some strategy (utilizing perhaps the test quantities $\hat A$ and
$\hat B$, perhaps some other functional of the input $v(t)$) by which to make a choice
among the four hypotheses that meets some standard of long-run success. To derive
this strategy we turn to the theory of the statistical testing of hypotheses.

The Bayes strategy of Sec. 1.1 can be applied to this problem if one is given
a matrix of the costs $C_{ij}$ of choosing hypothesis $H_i$ when hypothesis $H_j$ is really
true, and if one is also given the prior probabilities of the four hypotheses and the
prior probability density functions of the amplitudes $A$ and $B$. The Bayes strategy
involves calculating the average conditional risk $C(H_i \mid v(t))$ of each hypothesis,
defined as in (1-13), and the observer chooses the one whose conditional risk is the
smallest.

When some or all of the prior probabilities and costs are unspecified, the 
same difficulties arise as in the simpler detection problems treated earlier, and the 
elementary methods of testing hypotheses become inapplicable. Instead we adopt 
as a guide the method of maximum likelihood, introduced in Sec. 3.6.4 and applied 
in Sec. 7.2 to deal with the detection of a signal of unknown time of arrival. The 
resulting resolution strategy and its probabilities of success and failure shed some 
light on the resolvability of signals of this kind. 

The maximum-likelihood detection of a signal is in effect based on an estimate
of its amplitude, derived on the assumption that the input $v(t)$ actually contains the
signal. If the estimated amplitude is too small, the result is attributed to a noise
fluctuation, and the decision is made that no signal is present. Taking the same
viewpoint here, we imagine that the input $v(t)$ contains both signals, and we form
the maximum-likelihood estimates $\hat A$ and $\hat B$ of their amplitudes $A$ and $B$. If the
estimate $\hat A$ of the amplitude of the first signal is too small, the system decides that
the first signal is absent, and the second signal is treated in the same way. If both
$\hat A$ and $\hat B$ exceed certain decision levels, both signals are declared to be present.

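By (3-54) with

$$S(t) = S_a(t) + S_b(t), \qquad Q(t) = N^{-1} S(t),$$

the likelihood functional is

$$\Lambda[v(t); a, b] = \exp\left\{\frac{1}{N}\operatorname{Re}\int_0^T \left[S_a^*(t) + S_b^*(t)\right] V(t)\,dt - \frac{1}{2N}\int_0^T \left|S_a(t) + S_b(t)\right|^2 dt\right\},$$

where $V(t)$ is the complex envelope of the input. By introducing the complex signal
amplitudes

$$a = A \exp i\phi_1, \qquad b = B \exp i\phi_2,$$

and the circular complex Gaussian random variables

$$z_1 = \int_0^T F^*(t)\,V(t)\,dt, \qquad z_2 = \int_0^T G^*(t)\,V(t)\,dt, \tag{10-5}$$

we can write it as

$$\Lambda[v(t); a, b] = \exp\left\{N^{-1}\left[\operatorname{Re}(a^* z_1 + b^* z_2) - \tfrac{1}{2}\left(|a|^2 + 2\operatorname{Re}\lambda a^* b + |b|^2\right)\right]\right\}; \tag{10-6}$$

$\lambda$ is the overlap integral introduced in (10-4).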

Figure 10-1. Decision regions in choice among four hypotheses.

The likelihood functional is to be maximized with respect to the four variables
$A$, $\phi_1$, $B$, and $\phi_2$ or equivalently with respect to $a$, $a^*$, $b$, and $b^*$. By writing out the
exponent in terms of these and differentiating with respect to each as though they
were independent variables, we find the equations

$$a + b\lambda = z_1, \qquad a\lambda^* + b = z_2,$$

and their complex conjugates. Solving them, we obtain the maximum-likelihood
estimators

$$\hat a = \frac{z_1 - \lambda z_2}{1 - |\lambda|^2}, \qquad \hat b = \frac{z_2 - \lambda^* z_1}{1 - |\lambda|^2} \tag{10-7}$$

of the complex amplitudes $a$ and $b$.
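The estimators (10-7) can be checked numerically. The following is a minimal sketch, not part of the original treatment: it assumes a hypothetical pair of unit-energy Gaussian envelopes, one a delayed copy of the other, sampled on a uniform grid, and a noise-free input. The names `inner`, `ml_amplitudes`, and `gauss` are illustrative only.

```python
import cmath
import math

def inner(f, g, dt):
    """Discretized scalar product: integral of f*(t) g(t) dt."""
    return sum(x.conjugate() * y for x, y in zip(f, g)) * dt

def ml_amplitudes(F, G, V, dt):
    """Return (a_hat, b_hat, lam) following (10-4), (10-5), and (10-7)."""
    lam = inner(F, G, dt)                  # overlap integral lambda, (10-4)
    z1 = inner(F, V, dt)                   # matched-filter sample for signal A
    z2 = inner(G, V, dt)                   # matched-filter sample for signal B
    denom = 1.0 - abs(lam) ** 2
    a_hat = (z1 - lam * z2) / denom
    b_hat = (z2 - lam.conjugate() * z1) / denom
    return a_hat, b_hat, lam

# Hypothetical test signals (an assumption for illustration only).
dt = 0.01
t = [k * dt for k in range(2000)]          # observation interval (0, 20)

def gauss(center):
    """Unit-energy Gaussian envelope centered at `center`."""
    raw = [complex(math.exp(-0.5 * (tk - center) ** 2)) for tk in t]
    norm = math.sqrt(inner(raw, raw, dt).real)
    return [x / norm for x in raw]

F = gauss(9.0)
G = gauss(10.0)                            # copy of F delayed by tau = 1
a = 2.0 * cmath.exp(0.3j)                  # true complex amplitudes
b = 0.5 * cmath.exp(-1.1j)
V = [a * f + b * g for f, g in zip(F, G)]  # noise-free input envelope

a_hat, b_hat, lam = ml_amplitudes(F, G, V, dt)
print("overlap |lambda| =", abs(lam))
print("amplitude errors:", abs(a_hat - a), abs(b_hat - b))
```

With no noise the estimates reproduce $a$ and $b$ to rounding error even though the two envelopes overlap strongly; the plain matched-filter samples $z_1$ and $z_2$ would not, since each contains a contribution $\lambda b$ or $\lambda^* a$ from the other signal.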

The estimators of the amplitudes themselves are then the statistics $\hat A = |\hat a|$ and
$\hat B = |\hat b|$ given in (10-3). As we said there, these statistics can be generated by passing
the input $v(t)$ through appropriate narrowband filters, which are followed by linear
rectifiers. The outputs of those rectifiers at the end of the observation interval are
compared with a decision level $\gamma$, which can be so adjusted that if $\hat A < \gamma$, signal
A is declared absent; if $\hat B < \gamma$, signal B is rejected. Denoting once again the four
hypotheses by $H_0$, $H_1$, $H_2$, and $H_3$ and the decision regions by $R_0$, $R_1$, $R_2$, and $R_3$,
we can represent the decision strategy as in Fig. 10-1.

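A false alarm occurs when no signals are present and any of the three hypotheses
$H_1$, $H_2$, or $H_3$ is chosen. We shall now describe how to calculate the probability
of a false alarm. Under hypothesis $H_0$ the circular complex Gaussian random variables
$z_1$ and $z_2$ represent pure noise. By (10-4), (10-2), (10-5), and (3-45) the elements
of their complex covariance matrix $\boldsymbol{\phi}_z$ are

$$\tfrac{1}{2}E(z_1 z_1^* \mid H_0) = \tfrac{1}{2}E(z_2 z_2^* \mid H_0) = N, \qquad
\tfrac{1}{2}E(z_1 z_2^* \mid H_0) = N\lambda.$$

Introducing the column vector $\mathbf{z}$ of the $z$'s and its conjugate transposed row vector
$\mathbf{z}^+ = (z_1^*, z_2^*)$, we can write this covariance matrix as

$$\boldsymbol{\phi}_z = \tfrac{1}{2}E(\mathbf{z}\mathbf{z}^+ \mid H_0) = N\boldsymbol{\Lambda}, \qquad
\boldsymbol{\Lambda} = \begin{pmatrix} 1 & \lambda \\ \lambda^* & 1 \end{pmatrix}. \tag{10-8}$$

With $\hat{\mathbf{a}}$ the column vector of the estimators $\hat a$ and $\hat b$,

$$\hat{\mathbf{a}} = \mathbf{K}\mathbf{z}, \qquad
\mathbf{K} = \boldsymbol{\Lambda}^{-1} = \frac{1}{1 - |\lambda|^2}\begin{pmatrix} 1 & -\lambda \\ -\lambda^* & 1 \end{pmatrix},$$

by (10-7). The complex covariance matrix of the estimators is

$$\boldsymbol{\phi}_{ab} = \tfrac{1}{2}E(\hat{\mathbf{a}}\hat{\mathbf{a}}^+ \mid H_0) = \mathbf{K}\boldsymbol{\phi}_z\mathbf{K}^+ = N\mathbf{K}\boldsymbol{\Lambda}\mathbf{K}^+ = N\mathbf{K}^+ = N\mathbf{K},$$

and their inverse covariance matrix is

$$\boldsymbol{\phi}_{ab}^{-1} = N^{-1}\boldsymbol{\Lambda}.$$

The joint probability density function of the real and imaginary parts of $\hat a$ and $\hat b$ is
therefore

$$p_0(\hat a, \hat b) = \frac{1 - |\lambda|^2}{(2\pi)^2 N^2}\exp\left[-\frac{|\hat a|^2 + \lambda^* \hat a \hat b^* + \lambda \hat a^* \hat b + |\hat b|^2}{2N}\right]$$

by (3-40).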

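If we now put

$$\hat a = \hat A \exp i\theta_A, \qquad \hat b = \hat B \exp i\theta_B,$$

we can write

$$\lambda^* \hat a \hat b^* + \lambda \hat a^* \hat b = 2\operatorname{Re}\lambda \hat a^* \hat b = 2|\lambda|\hat A\hat B\cos(\arg\lambda - \theta_A + \theta_B).$$

Introducing the Jacobian $\hat A\hat B$ of this transformation to polar coordinates and
integrating over $0 < \theta_A < 2\pi$, $0 < \theta_B < 2\pi$, we find for the joint probability density
function of the estimators $\hat A$ and $\hat B$ under hypothesis $H_0$

$$p_0(\hat A, \hat B) = \frac{1 - |\lambda|^2}{N^2}\,\hat A\hat B\,\exp\left(-\frac{\hat A^2 + \hat B^2}{2N}\right) I_0\!\left(\frac{|\lambda|\hat A\hat B}{N}\right).$$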

The false-alarm probability is obtained by integrating this density function over
the region $R_0$ in Fig. 10-1 and subtracting from 1,

$$Q_0 = 1 - (1 - |\lambda|^2)\int_0^c\!\!\int_0^c xy\,e^{-(x^2 + y^2)/2}\,I_0(|\lambda|xy)\,dx\,dy, \qquad c = \frac{\gamma}{\sqrt{N}},$$

and with some labor this can be reduced to a combination of Q functions,

$$\begin{aligned}
Q_0 &= 1 - 2(1 - |\lambda|^2)\int_0^c\!\!\int_0^x xy\,e^{-(x^2 + y^2)/2}\,I_0(|\lambda|xy)\,dy\,dx\\
&= 1 - 2(1 - |\lambda|^2)\int_0^c x\,e^{-x^2/2}\left[\int_0^x y\,e^{-y^2/2}\,I_0(|\lambda|xy)\,dy\right]dx\\
&= 1 - 2(1 - |\lambda|^2)\int_0^c x\,e^{-(1 - |\lambda|^2)x^2/2}\left[1 - Q(|\lambda|x, x)\right]dx\\
&= e^{-(1 - |\lambda|^2)c^2/2}\left[1 - Q(|\lambda|c, c) + Q(c, |\lambda|c)\right]
\end{aligned}$$

in terms of Marcum's Q function (3-76). The last step is most easily verified by
differentiating the result with respect to $c$ and using (C-9) and (C-10). For
$c(1 - |\lambda|) \gg 1$, an adequate approximation is

$$Q_0 \approx 2e^{-(1 - |\lambda|^2)c^2/2}, \tag{10-9}$$

which corresponds to neglecting the integral over the region $R_3$ in Fig. 10-1.

Figure 10-2. Signal-to-noise ratio (dB) required to attain a detection probability
$Q_d = 0.999$ for one signal overlapping another, as a function of the overlap
parameter $|\lambda|$. The curves are indexed with the false-alarm probability $Q_0$.
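As a small worked illustration, the approximation (10-9) can be inverted for the normalized decision level $c = \gamma/\sqrt{N}$ that yields a prescribed false-alarm probability. The sketch below assumes only (10-9) itself; the function names are hypothetical, and the result is trustworthy only where the stated condition $c(1 - |\lambda|) \gg 1$ holds, which the loop prints for inspection.

```python
import math

def false_alarm_approx(c, lam_abs):
    """Approximation (10-9): Q0 ~ 2 exp[-(1 - |lambda|^2) c^2 / 2]."""
    return 2.0 * math.exp(-0.5 * (1.0 - lam_abs ** 2) * c ** 2)

def threshold_for(q0, lam_abs):
    """Invert (10-9) for the normalized decision level c = gamma / sqrt(N)."""
    return math.sqrt(2.0 * math.log(2.0 / q0) / (1.0 - lam_abs ** 2))

for lam_abs in (0.0, 0.5, 0.9):
    c = threshold_for(1e-6, lam_abs)
    # c * (1 - |lambda|) should be much larger than 1 for (10-9) to apply.
    print(f"|lambda| = {lam_abs:.1f}: c = {c:.3f}, c(1 - |lambda|) = {c * (1 - lam_abs):.3f}")
```

The required level grows as $|\lambda|$ approaches 1 because the variance of each estimate, $N/(1 - |\lambda|^2)$, grows with the overlap.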

When signal A is present, as under hypotheses $H_1$ and $H_3$, the probability
density function of the real and imaginary parts of the circular complex random
variable $\hat a$ is

$$p_1(\hat a) = \frac{1 - |\lambda|^2}{2\pi N}\exp\left[-\frac{(1 - |\lambda|^2)\,|\hat a - a|^2}{2N}\right].$$

As in Sec. 3.4, we find that the probability of correctly deciding that signal A is
present is

$$Q_d = \Pr(|\hat a| > \gamma \mid H_1 \cup H_3) = Q\left[d(1 - |\lambda|^2)^{1/2},\ c(1 - |\lambda|^2)^{1/2}\right],$$

where $d^2 = 2E/N$ is the input signal-to-noise ratio for signal A; $Q_d$ is the probability
of detecting one signal when it is possible that both may be present. In Fig. 10-2
we exhibit the input energy-to-noise ratio $S = \tfrac{1}{2}d^2$ required to attain a detection
probability $Q_d = 0.999$ as a function of the parameter $|\lambda|$, which measures the degree
to which the two signals overlap. The closer $|\lambda|$ lies to 1, the more energy is required
to attain a specified reliability $(Q_0, Q_d)$ in this decision.
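The dependence of $Q_0$ and $Q_d$ on the overlap can be explored numerically. The sketch below is an illustration under stated assumptions, not the computational method of the text: Marcum's Q function is evaluated by a Poisson-weighted series (a standard representation of the noncentral chi-squared tail with two degrees of freedom), which is adequate only for moderate arguments. The function names are hypothetical.

```python
import math

def marcum_q(alpha, beta, terms=400):
    """Marcum's Q(alpha, beta) via the series
    sum_k Pois(k; alpha^2/2) * exp(-beta^2/2) * sum_{j<=k} (beta^2/2)^j / j!.
    Intended for moderate arguments (alpha^2/2 well below about 700)."""
    x, y = 0.5 * alpha * alpha, 0.5 * beta * beta
    weight = math.exp(-x)     # Poisson weight for k = 0
    term = math.exp(-y)       # exp(-y) y^j / j! for j = 0
    tail = term               # partial sum over j <= k
    total = 0.0
    for k in range(terms):
        total += weight * tail
        weight *= x / (k + 1)
        term *= y / (k + 1)
        tail += term
    return total

def false_alarm_exact(c, lam_abs):
    """Closed form for Q0 preceding (10-9), with c = gamma / sqrt(N)."""
    mu = 1.0 - lam_abs ** 2
    return math.exp(-0.5 * mu * c * c) * (
        1.0 - marcum_q(lam_abs * c, c) + marcum_q(c, lam_abs * c))

def detection_probability(d, c, lam_abs):
    """Q_d = Q[d (1 - |lambda|^2)^(1/2), c (1 - |lambda|^2)^(1/2)]."""
    root = math.sqrt(1.0 - lam_abs ** 2)
    return marcum_q(d * root, c * root)

for lam_abs in (0.0, 0.5, 0.8):
    print(lam_abs, false_alarm_exact(5.0, lam_abs), detection_probability(8.0, 5.0, lam_abs))
```

For $\lambda = 0$ the closed form reduces to $2e^{-c^2/2} - e^{-c^2}$, the false-alarm probability of two independent Rayleigh tests, which gives a quick sanity check on the series.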
10.1.3 The Resolution of Many Signals



10.1.3.1 Maximum-likelihood detection. A radar may be called on to detect
echoes arriving at any time $\tau$ within an interval of long duration and having carrier
frequencies anywhere within a broad range of values about the frequency $\Omega$ of the
transmitted pulses. In addition, there may be times when a great many closely spaced
targets are to be expected, and it will be necessary to determine both the number of
targets and their ranges and range rates. If the echo signals might overlap in time
or frequency or both, the problem of resolving them is especially difficult.

One approach is to impose on the space of the invariable parameters $\theta''$ a
rectangular grid of more or less uniformly spaced values and to concentrate on the
detection of the $M$ signals

$$s_j(t) = A_j \operatorname{Re} F(t; \theta_j'')\exp(i\Omega t + i\phi_j), \qquad j = 1, 2, \ldots, M, \tag{10-10}$$

having parameters $\theta_j''$ at the points of the grid. The signals $F(t; \theta_j'')$ are normalized
as in (10-2). The spacings of the points will correspond to the degree of resolution
that one wishes to attain with respect to the parameters $\theta''$.

As in Sec. 10.1.2, the detection of these $M$ signals can be treated by the method
of maximum likelihood. It is assumed that the input $v(t) = \operatorname{Re} V(t)\exp i\Omega t$ contains
all $M$ signals,

$$V(t) = S(t) + N(t), \qquad S(t) = \sum_{j=1}^{M} a_j F(t; \theta_j''), \qquad a_j = A_j \exp i\phi_j, \tag{10-11}$$

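where $N(t)$ is the complex envelope of the noise, assumed white. The logarithm
of the likelihood functional for detecting the composite narrowband signal
$s(t) = \operatorname{Re} S(t)\exp i\Omega t$ in white noise is, by (3-54) with $Q(t) = S(t)/N$,

$$\begin{aligned}
\ln\Lambda[v(t)] &= \frac{1}{N}\operatorname{Re}\int_0^T S^*(t)\,V(t)\,dt - \frac{1}{2N}\int_0^T |S(t)|^2\,dt\\
&= \frac{1}{N}\operatorname{Re}\sum_{m=1}^{M} a_m^* z_m - \frac{1}{2N}\sum_{m=1}^{M}\sum_{n=1}^{M} a_m^* \lambda_{mn} a_n,
\end{aligned} \tag{10-12}$$

where

$$z_m = \int_0^T F^*(t; \theta_m'')\,V(t)\,dt \tag{10-13}$$

is the sample, taken at the end of the observation interval $(0, T)$, of the complex
envelope of the output of a filter matched to the signal $s_m(t)$, defined as in (10-10).
Furthermore

$$\lambda_{mn} = \int_0^T F^*(t; \theta_m'')\,F(t; \theta_n'')\,dt, \qquad \lambda_{mm} = 1, \tag{10-14}$$

are elements of the ambiguity matrix $\boldsymbol{\Lambda}$ and measure the extent to which signals $s_m(t)$
and $s_n(t)$ overlap.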

The maximum-likelihood estimates of the complex amplitudes $a_m$ are obtained
by differentiating (10-12) with respect to $a_m^*$ and setting the result equal to zero,

$$\sum_{j=1}^{M} \lambda_{mj}\,\hat a_j = z_m, \qquad 1 \le m \le M, \tag{10-15}$$

and the solutions of these simultaneous equations are

$$\hat a_m = \sum_{j=1}^{M} k_{mj}\,z_j, \tag{10-16}$$

where the $k_{mj}$ are the elements of the $M \times M$ matrix

$$\mathbf{K} = \boldsymbol{\Lambda}^{-1}, \qquad \boldsymbol{\Lambda} = \|\lambda_{mn}\|. \tag{10-17}$$

The estimated amplitudes $\hat A_m = |\hat a_m|$ are compared with decision levels $\gamma_m$; if
$\hat A_m > \gamma_m$, the receiver decides that the $m$th signal, whose parameters are $\theta_m''$, is present.
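The solution of the simultaneous equations (10-15) can be sketched for a small grid. The example below is illustrative only: it assumes a hypothetical ambiguity matrix generated by unit-energy Gaussian envelopes spaced in time, a noise-free input, and a plain Gaussian-elimination solver written for the purpose; none of these choices comes from the text.

```python
import math

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

# Hypothetical grid: M Gaussian envelopes of unit width spaced by delta, for
# which lambda_mn = exp(-((m - n) * delta)^2 / 4). This is an assumption.
M, delta = 5, 1.5
Lam = [[complex(math.exp(-((m - n) * delta) ** 2 / 4.0)) for n in range(M)]
       for m in range(M)]

# Noise-free input containing only the signals with (0-based) indices 1 and 2.
a_true = [0j, 1.0 + 0.5j, -0.8j, 0j, 0j]
z = [sum(Lam[m][n] * a_true[n] for n in range(M)) for m in range(M)]  # cf. (10-18)

a_hat = solve(Lam, z)                 # solves (10-15); equivalent to (10-16)
A_hat = [abs(a) for a in a_hat]
print("matched-filter magnitudes |z_m|:", [round(abs(v), 3) for v in z])
print("estimated amplitudes  A_hat_m  :", [round(v, 3) for v in A_hat])
```

The matched-filter magnitudes $|z_m|$ are appreciable at grid points where no signal is present, whereas the estimates $\hat A_m$ vanish there; this is the sense in which multiplying by $\mathbf{K}$ resolves the overlapping signals.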

If there were no noise and if the only signals permitted were those having
parameter values $\theta_j''$, this system could correctly identify which signals are present.
If a signal might arrive with intermediate values of the parameters, however, two
adjacent estimates $\hat A_j$ might exceed their decision levels $\gamma_j$, indicating the presence of
two signals when only one is there. This must be counted as an error of the system.
Random noise introduces false alarms that signals are present when they are not.
We concentrate now on determining the probability $Q_d$ that the receiver correctly
decides that an arbitrary one of the $M$ signals $s_j(t)$ is present.

When the $n$th signal $s_n(t)$ is present, a situation that we label as hypothesis $H_n$,

$$E(z_j \mid H_n) = \lambda_{jn}\,a_n, \tag{10-18}$$

by (10-13) and (10-14); and by (10-16)

$$E(\hat a_m \mid H_n) = a_n\sum_{j=1}^{M} k_{mj}\lambda_{jn} = a_n\delta_{mn} =
\begin{cases} a_n, & m = n,\\ 0, & m \ne n, \end{cases} \tag{10-19}$$

because the matrices $\boldsymbol{\Lambda}$ and $\mathbf{K}$ are inverses; $\delta_{mn}$ is the Kronecker delta.

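Let us denote by $\hat{\mathbf{a}}$ the column vector of the estimates $\hat a_m$ of the $M$ complex
amplitudes and by $\mathbf{z}$ the column vector of the $M$ circular complex Gaussian random
variables $z_m$ defined in (10-13). Then by (10-13), (10-14), and (3-45)

$$\begin{aligned}
\tfrac{1}{2}E(z_m z_n^* \mid H_0) &= \tfrac{1}{2}\int_0^T\!\!\int_0^T F^*(t_1; \theta_m'')\,F(t_2; \theta_n'')\,E[V(t_1)V^*(t_2) \mid H_0]\,dt_1\,dt_2\\
&= N\int_0^T F^*(t; \theta_m'')\,F(t; \theta_n'')\,dt = N\lambda_{mn},
\end{aligned} \tag{10-20}$$

and the complex covariance matrix of the $z_m$'s can be written

$$\boldsymbol{\phi}_z = \tfrac{1}{2}E(\mathbf{z}\mathbf{z}^+ \mid H_0) = N\boldsymbol{\Lambda} \tag{10-21}$$

as in (10-8). Because by (10-16) $\hat{\mathbf{a}} = \mathbf{K}\mathbf{z}$, the complex covariance matrix of the
estimates $\hat a_m$ is

$$\boldsymbol{\phi}_a = \tfrac{1}{2}E(\hat{\mathbf{a}}\hat{\mathbf{a}}^+ \mid H_0) = \tfrac{1}{2}E(\mathbf{K}\mathbf{z}\mathbf{z}^+\mathbf{K}^+ \mid H_0) = N\mathbf{K}\boldsymbol{\Lambda}\mathbf{K}^+ = N\mathbf{K}^+ = N\mathbf{K} \tag{10-22}$$

by (10-17). Thus the variances of the real and imaginary parts of the estimate $\hat a_n$
are equal to $Nk_{nn}$, where $k_{nn}$ is the $n$th diagonal element of the inverse $\mathbf{K}$ of the
ambiguity matrix $\boldsymbol{\Lambda}$.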






If we denote by $Q_0$ the probability of deciding that the $n$th signal is present
(accepting hypothesis $H_n$) when it is not, and by $Q_d$ the probability of doing so under
hypothesis $H_n$ that the $n$th signal is present, we find by the methods of Sec. 3.4 that

$$Q_0 = \Pr(\hat A_n > \gamma_n \mid H_0) = e^{-c^2/2}, \qquad
Q_d = \Pr(\hat A_n > \gamma_n \mid H_n) = Q(d_n, c),$$

in terms of Marcum's Q function. Here $c$ is proportional to the decision level $\gamma_n$
on the estimate $\hat A_n = |\hat a_n|$ of the amplitude of the $n$th signal, and

$$d_n^2 = \frac{2E_n}{Nk_{nn}} \tag{10-23}$$

is the effective signal-to-noise ratio, where $E_n = \tfrac{1}{2}|a_n|^2$ is the energy of the $n$th signal.

When we set the decision levels so that all the false-alarm probabilities $Q_0$ are
equal, the overall false-alarm probability

$$Q_1 = \Pr(\hat A_1 > \gamma_1 \cup \hat A_2 > \gamma_2 \cup \cdots \cup \hat A_M > \gamma_M \mid H_0)$$

is bounded by

$$Q_1 \le MQ_0 \tag{10-24}$$

because for any $M$ events $\mathcal{E}_j$,

$$\Pr\left[\bigcup_{j=1}^{M}\mathcal{E}_j\right] \le \sum_{j=1}^{M}\Pr(\mathcal{E}_j); \tag{10-25}$$

this is called the union bound. The quantity $MQ_0$ will be a good approximation
to the overall false-alarm probability when $Q_0 \ll 1$. It corresponds to (10-9)
for $M = 2$.
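A brief sketch shows how the union bound (10-24) might be used to set a common decision level. It takes $c = \gamma_n/(Nk_{nn})^{1/2}$, which is an inference from the variance $Nk_{nn}$ quoted above rather than a formula stated in the text. The comparison with statistically independent tests is included only as a reference case; the estimates $\hat a_n$ are in general correlated. The function names are hypothetical.

```python
import math

def per_signal_level(q_total, M):
    """Normalized level c such that M * exp(-c^2 / 2) equals q_total."""
    q0 = q_total / M
    return math.sqrt(2.0 * math.log(1.0 / q0))

def independent_case(q0, M):
    """Overall false-alarm probability if the M tests were independent
    (an assumption made only for comparison with the bound M * q0)."""
    return 1.0 - (1.0 - q0) ** M

M = 100
c = per_signal_level(1e-4, M)
q0 = math.exp(-0.5 * c * c)
print("c =", c, " per-signal Q0 =", q0)
print("union bound:", M * q0, " independent tests:", independent_case(q0, M))
```

When $Q_0 \ll 1$ the two figures nearly coincide, which is the sense in which $MQ_0$ is a good approximation.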



10.1.3.2 Resolution of signals in time. Let us now assume that the signals
are uniformly separated in arrival time by $\delta$ so that

$$F(t; \theta_m'') = F(t - \tau_m), \qquad \tau_m = m\delta, \qquad 1 \le m \le M. \tag{10-26}$$

When, as is usual, the observation interval $(0, T)$ is much longer than the duration $T'$
of each signal, the elements of the ambiguity matrix $\boldsymbol{\Lambda}$ are, by (10-14),

$$\lambda_{mn} = \int_{-\infty}^{\infty} F^*(t - \tau_m)\,F(t - \tau_n)\,dt = \lambda_{m-n} \tag{10-27}$$

and $\boldsymbol{\Lambda}$ is a Toeplitz matrix, each row of which is displaced to the right of the row
above by one place. The elements of this matrix are

$$\lambda_j = \int_{-\infty}^{\infty} F^*(t - j\delta)\,F(t)\,dt = \int_{-\infty}^{\infty} |f(\omega)|^2 e^{ij\delta\omega}\,\frac{d\omega}{2\pi} \tag{10-28}$$

in terms of the Fourier transform

$$f(\omega) = \int_{-\infty}^{\infty} F(t)\,e^{-i\omega t}\,dt$$


of the complex envelope of the signals. If we now assume that $M \gg 1$ and disregard
signals that might arrive near one end or the other of the observation interval,
$T - M\delta \gg T'$, the inverse matrix $\mathbf{K}$ will also have nearly the Toeplitz form, and its
elements will be

$$k_{mn} = k_{m-n}$$

with

$$\sum_{j=-\infty}^{\infty} k_{m-j}\,\lambda_{j-n} = \delta_{mn} =
\begin{cases} 1, & m = n,\\ 0, & m \ne n, \end{cases} \tag{10-29}$$

by (10-17).

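This convolutional equation can be solved by thinking of the matrix elements
$k_p$, $\lambda_p$ as Fourier coefficients of periodic functions

$$K(u) = \sum_{p=-\infty}^{\infty} k_p\,e^{-ipu}, \qquad \Lambda(u) = \sum_{p=-\infty}^{\infty} \lambda_p\,e^{-ipu}, \tag{10-30}$$

and by the convolution theorem (10-29) yields

$$K(u)\,\Lambda(u) = 1, \qquad -\pi < u < \pi. \tag{10-31}$$

By (10-28) the periodic function $\Lambda(u)$ is

$$\begin{aligned}
\Lambda(u) &= \sum_{j=-\infty}^{\infty}\int_{-\infty}^{\infty} |f(\omega)|^2 e^{ij(\delta\omega - u)}\,\frac{d\omega}{2\pi}\\
&= \int_{-\infty}^{\infty} |f(\omega)|^2 \sum_{k=-\infty}^{\infty}\delta(\delta\omega - u + 2k\pi)\,d\omega\\
&= \frac{1}{\delta}\sum_{k=-\infty}^{\infty}\left|f\!\left(\frac{u - 2k\pi}{\delta}\right)\right|^2.
\end{aligned} \tag{10-32}$$

Here we have used the periodic delta function

$$\sum_{k=-\infty}^{\infty}\delta(x + 2k\pi) = \frac{1}{2\pi}\sum_{j=-\infty}^{\infty} e^{ijx}. \tag{10-33}$$

The diagonal element of the inverse matrix $\mathbf{K}$ is now

$$k_0 = \int_{-\pi}^{\pi} K(u)\,\frac{du}{2\pi}
= \frac{\delta}{2\pi}\int_{-\pi}^{\pi}\left[\sum_{k=-\infty}^{\infty}\left|f\!\left(\frac{u - 2k\pi}{\delta}\right)\right|^2\right]^{-1} du
= \frac{\delta^2}{2\pi}\int_{-\pi/\delta}^{\pi/\delta}\left[\sum_{k=-\infty}^{\infty}\left|f\!\left(\omega - \frac{2k\pi}{\delta}\right)\right|^2\right]^{-1} d\omega. \tag{10-34}$$

The denominator of the integrand is sketched in Fig. 10-3, in which we have marked
the approximate bandwidth $2\pi W$ of the signals $\operatorname{Re} F(t - \tau_n)\exp i\Omega t$ in angular
frequency.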

Figure 10-3. Denominator of (10-34).

When the signals $s_j(t)$ do not overlap significantly, $W\delta \gg 1$, then $\lambda_j \approx 0$ for
$|j| > 0$. Because $\lambda_0 = 1$, $k_0 = 1$; and the effective signal-to-noise ratio in (10-23)
is equal to the input signal-to-noise ratio. The signals do not interfere and can be
detected as effectively as though they arrived alone. When $W\delta \ll 1$, on the other
hand, the peaks of the denominator of (10-34) are widely separated, only the central
term ($k = 0$) contributes significantly, and the effective output signal-to-noise ratio
is approximately

$$d_n^2 \approx \frac{2E_n}{N}\,\frac{2\pi}{\delta^2}\left[\int_{-\pi/\delta}^{\pi/\delta}\frac{d\omega}{|f(\omega)|^2}\right]^{-1}. \tag{10-35}$$

In order to attain a specified reliability, the input signal-to-noise ratio, which is now

$$\frac{2E_n}{N} = d_n^2\,\frac{\delta^2}{2\pi}\int_{-\pi/\delta}^{\pi/\delta}\frac{d\omega}{|f(\omega)|^2}, \tag{10-36}$$

must be very large. If the signals $s_j(t)$ were strictly bandlimited to $W$ hertz, the right
side of (10-36) would even be infinite when $W\delta < 1$.
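The loss in effective signal-to-noise ratio can be made concrete by evaluating (10-34) numerically. The sketch below assumes a hypothetical unit-energy Gaussian envelope $F(t) = (\pi\sigma^2)^{-1/4}\exp(-t^2/2\sigma^2)$, for which $|f(\omega)|^2 = 2\sqrt{\pi}\,\sigma\exp(-\sigma^2\omega^2)$; the midpoint-rule integration, the truncation of the sum over $k$, and the function names are illustrative choices, not part of the text.

```python
import math

def spectrum_sq(omega, sigma=1.0):
    """|f(omega)|^2 for the assumed unit-energy Gaussian envelope."""
    return 2.0 * math.sqrt(math.pi) * sigma * math.exp(-(sigma * omega) ** 2)

def k0(delta, sigma=1.0, n_steps=2000, n_alias=50):
    """Diagonal element k_0 of the inverse ambiguity matrix, from (10-34).
    Midpoint rule over -pi/delta < omega < pi/delta; the sum over k is
    truncated to |k| <= n_alias."""
    step = 2.0 * math.pi / delta / n_steps
    total = 0.0
    for i in range(n_steps):
        omega = -math.pi / delta + (i + 0.5) * step
        denom = sum(spectrum_sq(omega - 2.0 * math.pi * k / delta, sigma)
                    for k in range(-n_alias, n_alias + 1))
        total += step / denom
    return delta ** 2 / (2.0 * math.pi) * total

for delta in (6.0, 3.0, 2.0, 1.5, 1.0):
    k = k0(delta)
    # By (10-23), the effective SNR d_n^2 equals the input SNR 2E_n/N divided by k_0.
    print(f"delta = {delta:3.1f}: k0 = {k:10.3f}, SNR loss = {10 * math.log10(k):6.2f} dB")
```

For widely spaced signals $k_0$ is close to 1 and nothing is lost; as $\delta$ shrinks below the pulse width, $k_0$ grows rapidly, in line with the remark that the input signal-to-noise ratio in (10-36) must then be very large.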

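Let us now work out the transfer function of a linear, narrowband filter whose
rectified output, sampled at the appropriate time, yields the estimate $\hat A_n$ of the
amplitude of the $n$th signal. The estimate $\hat a_n$ of the complex amplitude of that signal
is given approximately as

$$\hat a_n = \sum_{j=-\infty}^{\infty} k_{n-j}\,z_j = \sum_{j=-\infty}^{\infty} k_{n-j}\int_{-\infty}^{\infty} F^*(t - j\delta)\,V(t)\,dt = \int_{-\infty}^{\infty} H_n^*(t)\,V(t)\,dt \tag{10-37}$$

by (10-16). Here

$$H_n(t) = \sum_{j=-\infty}^{\infty} k_{n-j}^*\,F(t - j\delta)$$

has the Fourier transform

$$h_n(\omega) = \sum_{j=-\infty}^{\infty} k_{n-j}^*\,f(\omega)\,e^{-ij\delta\omega} = f(\omega)\,e^{-in\delta\omega}\sum_{p=-\infty}^{\infty} k_p^*\,e^{ip\delta\omega} = h(\omega)\exp(-i\omega\tau_n),$$

where

$$h(\omega) = K(\delta\omega)\,f(\omega) = \frac{\delta f(\omega)}{\displaystyle\sum_{k=-\infty}^{\infty}\left|f\!\left(\omega - \frac{2k\pi}{\delta}\right)\right|^2} \tag{10-38}$$

by (10-30) through (10-32). The estimates of the signals can therefore be generated
by passing the input $\operatorname{Re} V(t)\exp i\Omega t$ through a narrowband filter whose complex
impulse response is

$$K_h(s) = \begin{cases} H^*(T' - s), & 0 < s < T',\\ 0, & s < 0,\ s > T', \end{cases} \tag{10-39}$$

where $H(t)$ is the Fourier transform of $h(\omega)$. Its narrowband transfer function is

$$Y(\omega) = h^*(\omega)\,e^{-i\omega T'} = \frac{\delta f^*(\omega)\,e^{-i\omega T'}}{\displaystyle\sum_{k=-\infty}^{\infty}\left|f\!\left(\omega - \frac{2k\pi}{\delta}\right)\right|^2}. \tag{10-40}$$

The estimate $\hat a_n$ is obtained by sampling the complex envelope of the output of this
filter at time $\tau_n + T' = n\delta + T'$. If we place a rectifier after this narrowband filter,
the presence of any one of the signals $s_j(t)$ is indicated by the crossing of a decision
level $\gamma$ by the output of this rectifier.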

The complex envelope of the signal component of the output of the
narrowband filter when the $n$th signal

$$s_n(t) = \operatorname{Re} a_n F(t - \tau_n)\,e^{i\Omega t}$$

is present has the Fourier transform

$$f_o(\omega) = a_n h^*(\omega) f(\omega)\exp[-i\omega(\tau_n + T')]
= \frac{a_n\,\delta\,|f(\omega)|^2}{\displaystyle\sum_{k=-\infty}^{\infty}\left|f\!\left(\omega - \frac{2k\pi}{\delta}\right)\right|^2}\exp[-i\omega(\tau_n + T')].$$

When the arrival times $\tau_n$ of the signals are closely spaced, $W\delta \ll 1$, this spectrum
is nearly uniform over a band of width $\delta^{-1}$ hertz, outside of which it nearly vanishes.
The signal at the output of the subsequent rectifier therefore has a width on the order
of $\delta$, and its peak value occurs at time $\tau_n + T'$. The filter whose narrowband transfer
function is given by (10-40) sharpens the incoming signals to such an extent that its
outputs due to those signals arriving at times separated by $\delta$ do not significantly
overlap. The filter also enhances the noise, but maximizes the output signal-to-noise
ratio for each of these signals. The shorter the interval $\delta$ between the signals that our
receiver is required to resolve, the smaller is this maximum output signal-to-noise
ratio $d_n^2$ of (10-35) and the more energy $E$ each signal must carry in order to be
detected with the specified reliability $(Q_0, Q_d)$.

10.2 THE DETECTION OF SIGNALS IN CLUTTER



10.2.1 The Spectrum of Clutter Interference 

A radar system is often called on to detect a target echo in the presence of a great 
many other echoes from raindrops or from the surface of the ground or the sea. 
Interference of this kind is known as clutter. In wartime, strips of metal foil known
as "chaff" or "window" are dropped from airplanes to confuse enemy radar by
creating a similar interference. The reverberation encountered in sonar is another
type of clutter. The task of a radar subjected to clutter might be considered as the 
resolution of a wanted signal from a number of undesirable ones that overlap it in 
time and frequency. 

Because the parameters of the extraneous signals are unpredictable, however, 
it is more convenient to view the clutter as a type of noise. An apt model pictures 
it as composed of reflections of the transmitted pulse from a large number of small 
dispersed scatterers. Because the net voltage they produce at the receiver input is the 
sum of a large number of weak, random voltages, the clutter can be described by a 
Gaussian distribution. Methods developed earlier for detecting a signal in colored 
Gaussian noise can be applied to finding the optimum detection system, which in 
turn can be analyzed to determine the probability of detection as a function of 
the strength and distribution of the clutter. One of our purposes in investigating 
this model is to bring out the similarity of the resultant optimum detector to the 
maximum-likelihood detector derived in Sec. 10.1. 

This clutter noise is not stationary, for the density of scatterers within the 
transmitted beam varies with the distance, and the total power they reflect is not 
uniform. Because the density and other characteristics of the scatterers usually 
change only slightly over a distance on the order of several radar pulse lengths, 
however, the detectability of an echo signal in the midst of the clutter will be nearly 
the same as if the clutter had at all times the same statistical properties as in the 
vicinity of the signal. We can therefore treat the clutter as though it were stationary. 
As it is Gaussian and has zero expected value, its probability density functions 
depend only on its spectral density. 

If the scatterers are all at rest with respect to the transmitter, the composite
echo signal takes the form $\operatorname{Re} S(t)\exp i\Omega t$, where

$$S(t) = \sum_n z_n F(t - \tau_n). \tag{10-41}$$

Here $F(t)$ is the complex envelope of the narrowband transmitted pulse, $z_n$ a complex
number specifying the amplitude and phase of the $n$th echo, and $\tau_n$ the epoch of the
$n$th echo. We suppose the envelope $F(t)$ normalized as in (10-2). The $z_n$'s and the $\tau_n$'s
are random variables independent from one scattering to another. The infinitesimal
echo pulses arrive at a high average rate and produce clutter with a finite average
power equal, say, to $P_c$. Noise of this kind has a spectral density proportional to the
absolute square of the spectrum

$$f(\omega) = \int_{-\infty}^{\infty} F(t)\,e^{-i\omega t}\,dt$$

of each pulse [Hel91, p. 401], [Pap91, p. 360]. Hence the spectral density of the
clutter is

$$\Phi_c(\omega) = \tilde\Phi_c(\omega - \Omega) + \tilde\Phi_c(-\omega - \Omega), \tag{10-42}$$

where $\tilde\Phi_c(\omega)$ is the positive-frequency part of the spectral density as measured from
the carrier frequency $\Omega$, and

$$\tilde\Phi_c(\omega) = \tfrac{1}{2}P_c\,|f(\omega)|^2. \tag{10-43}$$

Usually the scatterers are not all at rest with respect to the transmitter. The
transmitter may be on a moving airplane, and the scatterers may themselves be
moving erratically, as when trees, bushes, and sea spray are blown about by the
wind. In most cases the effect of such motions can be expressed by a convolution,

$$\tilde\Phi_c(\omega) = \tfrac{1}{2}\int_{-\infty}^{\infty} R_c(w)\,|f(\omega - w)|^2\,\frac{dw}{2\pi}, \tag{10-44}$$

where $R_c(w)\,dw/2\pi$ is the power in the clutter reflected by scatterers inducing a
frequency shift in an interval of width $dw/2\pi$ about the frequency $w/2\pi$. The total
clutter power is

$$P_c = \int_{-\infty}^{\infty} R_c(w)\,\frac{dw}{2\pi}.$$

For clutter due to reflections from windblown vegetation, the width of the distribution
$R_c(w)$ has been observed to be inversely proportional to the wavelength of the
transmitted pulses. The product of this width and the wavelength is on the order
of a few centimeters or tens of centimeters per second. Under certain circumstances
the distribution of the clutter amplitudes departs from the Gaussian, but for our
purposes we must disregard this. Many details of the characteristics of natural and
artificial clutter are to be found in [Law50], [Geo52], [McG60], [Sko62, Ch. 12],
and [Sch91b, Ch. 4]. Extensive treatments of detection in clutter are to be found in
[Rih69, pp. 350-80] and [Sch91b].

The complex autocovariance function $\tilde\phi_c(\tau)$ of the clutter is the Fourier transform
of $2\tilde\Phi_c(\omega)$ as in (3-19). Because (10-44) has the form of a convolution, its
Fourier transform is simply a product,

$$\tilde\phi_c(\tau) = \lambda(\tau, 0)\,r_c(\tau),$$

where

$$\lambda(\tau, 0) = \int_{-\infty}^{\infty} F(t - \tfrac{1}{2}\tau)\,F^*(t + \tfrac{1}{2}\tau)\,dt,$$

$$r_c(\tau) = \int_{-\infty}^{\infty} R_c(w)\,e^{-iw\tau}\,\frac{dw}{2\pi}, \qquad r_c(0) = P_c.$$

Thus the complex autocovariance function of the clutter is the product of the ambiguity
function $\lambda(\tau, 0)$ of the transmitted pulse and the Fourier transform of the
density $R_c(w)$ of the frequency shifts.

10.2.2 Detection of Single Pulses in Clutter 

Let the echo signal be represented by

$$s(t) = A \operatorname{Re} F_s(t - \tau)\,e^{i\Omega t + i\phi},$$

where $A$ and $\phi$ are its unknown amplitude and phase, $\tau$ its epoch, and $F_s(t)$
its complex envelope, normalized as in (10-2). If the signal is a version of the
transmitted pulse with the Doppler shift $w$, as we shall generally presume here,
$F_s(t) = F(t)\exp iwt$, $f_s(\omega) = f(\omega - w)$.

We suppose that the receiver is working only with the return from a single
transmitted pulse and is detecting echoes individually. The spreading of the spectral
density given by (10-44) can then be neglected, for the distribution $R_c(w)$ is much
narrower than the spectrum $f(\omega)$ of most radar pulses. By adopting as our reference
carrier frequency $\Omega$ that of the clutter echoes, we can write the spectral density of the
clutter as in (10-42) and (10-43). In addition to the clutter, the input to the receiver
will contain white noise of unilateral spectral density $N$. As we said in Sec. 3.2.4, the
white noise can be treated as narrowband noise whose spectral density is uniform
over the range of frequencies occupied by the signal. Hence the narrowband spectral
density of the total input noise is

$$\tilde\Phi(\omega) = \frac{N}{2} + \tilde\Phi_c(\omega) = \tfrac{1}{2}\left[N + P_c\,|f(\omega)|^2\right]. \tag{10-45}$$

Because the arrival time of the signal is unknown, we employ the maximum-likelihood
detection strategy developed in Sec. 7.2. The input to the receiver is passed
through a filter matched to a signal whose complex envelope is

$$Q(t) = \int_{-\infty}^{\infty} q(\omega)\,e^{i\omega t}\,\frac{d\omega}{2\pi}, \qquad
q(\omega) = \frac{f_s(\omega)}{2\tilde\Phi(\omega)} = \frac{f_s(\omega)}{N + P_c\,|f(\omega)|^2}, \tag{10-46}$$

where $f_s(\omega)$ is the Fourier transform of the signal envelope $F_s(t)$; compare (2-77)
through (2-79). The narrowband impulse response of the filter is

$$K(\tau) = \begin{cases} Q^*(T' - \tau), & 0 < \tau < T',\\ 0, & \tau < 0,\ \tau > T', \end{cases} \tag{10-47}$$

where $T'$ is a delay long enough for the interval $0 < t < T'$ to include most of
the reversed signal $Q^*(T' - t)$. The output of this matched filter is rectified and
compared with a decision level, which when surpassed indicates the presence of a
signal.

It would be difficult in practice to use this maximum-likelihood detector because
the clutter power $P_c$ is not constant, but varies in time through the variation
with distance of the density of the random scatterers. Because this variation is small
in a time on the order of a pulse width, one can envision a system using a time-varying
filter so designed that at each instant it is matched to the signal $Q(t)$, (10-46),
for the current value of $P_c$. Such a filter might be difficult to construct. An alternative
would be a set of many filters, each matched to $Q(t)$ for a particular value
of $P_c$. The receiver would measure the clutter power independently and by switching
arrange to use the filter matched for the nearest value of $P_c$.

If the delay $T'$ is very long, the complex narrowband transfer function $Y(\omega)$
of the matched filter, defined in Sec. 3.1, is

$$Y(\omega) = q^*(\omega)\,e^{-i\omega T'} = \frac{f_s^*(\omega)\,e^{-i\omega T'}}{N + P_c\,|f(\omega)|^2}. \tag{10-48}$$

The spectrum of the echo pulse as it issues from the matched filter is

$$f_o(\omega) = \frac{A\,|f_s(\omega)|^2}{N + P_c\,|f(\omega)|^2}\,e^{-i\omega(T' + \tau) + i\phi}$$

with frequencies still measured from the carrier frequency of the clutter.

In the situation least favorable for detection the target moves with the same
relative velocity as the scatterers, there is no relative frequency shift ($w = 0$), and
$f_s(\omega) = f(\omega)$. The amplitude $|f_o(\omega)|$ of the output spectrum is then nearly uniform
for angular frequencies lying within a band $-\omega_c < \omega < \omega_c$, with the cutoff frequency
$\omega_c$ given roughly by

$$P_c\,|f(\omega_c)|^2 \approx N.$$

[We suppose the pulse spectrum symmetrical, $|f(\omega)| = |f(-\omega)|$, for simplicity.] At
frequencies far from the carrier, for which $|\omega| \gg \omega_c$, the output spectrum drops off
to zero. The stronger is the clutter power $P_c$ compared with the product of $N$ and
the bandwidth $W$ of the signal, the larger the cutoff frequency $\omega_c$. The width of the
output pulse $F_o(t)$ from the matched filter will be on the order of $2\pi/\omega_c$ and much
smaller than $2\pi/W$, the approximate width of the echo signal. Thus the effect of the
matched filter in (10-47) is to sharpen the returning echo pulses as much as possible
without too greatly enhancing the output resulting from the white noise. It is hoped
that the signal echo will then stand out over the smaller echoes due to the clutter.

The filter specified by (10-48) for a signal with $w = 0$ is similar to the filters
proposed by Urkowitz [Urk53], for which $Y(\omega)$ equals $[f(\omega)]^{-1}$ for $|\omega|$ less than a
fixed cutoff frequency, beyond which $Y(\omega)$ vanishes. The cutoff frequency is taken
large enough to sharpen the signal pulses as much as possible, yet not so large
that the white noise generates an excessive output. The filter in (10-48) also closely
resembles the one derived in Sec. 10.1 for distinguishing signals that might arrive at
times separated by an interval $\delta$ that is much smaller than the reciprocal $W^{-1}$ of their
bandwidth. The cutoff frequency $\omega_c$ corresponds to the frequency $\pi/\delta$ marked in
Fig. 10-3; for $|\omega| < \pi/\delta$, the transfer function $Y(\omega)$ of that filter, as given by (10-40),
is roughly proportional to $[f(\omega)]^{-1}$; for $|\omega| \gg \pi/\delta$, $|Y(\omega)| \approx 0$. For $\omega_c \gg W$ and
$w = 0$, the spectrum $f_o(\omega)$ of the output of the filter in (10-48) closely resembles that
of the filter in (10-40).

The performance of a detection system of this kind is described by its false-alarm
and detection probabilities, which can be calculated as in Sec. 7.4. They are
approximately

$$Q_d \approx Q(d_{\mathrm{eff}}, b), \tag{10-49}$$

where $Q(\cdot\,, \cdot)$ is Marcum's Q function, $b$ is proportional to the decision level, $T$ is
the length of the observation interval, $d_{\mathrm{eff}}^2$ is the effective signal-to-noise ratio, given
by

$$d_{\mathrm{eff}}^2 = 2E \int_{-\infty}^{\infty} \frac{|f_s(\omega)|^2}{N + P_c |f(\omega)|^2}\, \frac{d\omega}{2\pi}, \tag{10-50}$$

and $\beta$ is a bandwidth defined by
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(2tt) 3 



co ^|/, (01 )P ^ (10-51) 



2tt 



co !/,<to)F ^ 
2-n 



([■{to) 



This bandwidth will exist only if $|f_1(\omega)|^2$ decreases to 0 more rapidly than $|\omega|^{-3}$ as
$|\omega| \to \infty$. In (10-50) $E = \tfrac12 A^2$ is the signal energy, the normalization in (10-2) being
maintained.

For a fixed false-alarm probability, the probability of detection depends most
strongly on the effective signal-to-noise ratio given by (10-50). If one alters the
shape of the transmitted pulses or even the transmitted power, to which $P_c$ is
proportional, the bandwidth $\beta$ and hence also the quantity $b$ in (10-49) must change. This
variation in $b$, however, has a much smaller effect on the detection probability $Q_d$
than the accompanying change in the effective signal-to-noise ratio $d_{\mathrm{eff}}^2$.

Let us further analyze the detectability of an echo with the same carrier frequency
as the clutter ($w = 0$). When $P_c \gg N\,\Delta\omega$, the integrand in (10-50) is nearly
uniform for $|\omega| < \omega_c$, where again $P_c |f(\omega_c)|^2 \approx N$. For $|\omega| > \omega_c$ the integrand drops
to zero. The effective signal-to-noise ratio $d_{\mathrm{eff}}^2$ is therefore roughly given by

$$d_{\mathrm{eff}}^2 \approx \frac{2E\omega_c}{\pi P_c}.$$

With both the signal energy $E$ and the average clutter power $P_c$ proportional to the
power output $P_T$ of the transmitter, the signal-to-noise ratio $d_{\mathrm{eff}}^2$ depends on $P_T$
only through the value of $\omega_c$. If, for instance, $|f(\omega)|$ is proportional to $|\omega|^{-n}$ for
large values of $|\omega|$, $|f(\omega)| \approx C_1 |\omega|^{-n}$, we see from the equation

$$P_c |C_1|^2 \omega_c^{-2n} \approx N$$

that $\omega_c$ is proportional to $P_c^{1/2n}$ and hence to $P_T^{1/2n}$. The effective signal-to-noise
ratio $d_{\mathrm{eff}}^2$ determining signal detectability through (10-49) is therefore also
proportional to $P_T^{1/2n}$. The larger $n$ is, the smaller the influence of a mere increase of
transmitter power on the probability of detection.
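
The scaling argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that the power-law tail $|f(\omega)| \approx C_1|\omega|^{-n}$ already holds at the cutoff and that $d_{\mathrm{eff}}^2 \approx 2E\omega_c/(\pi P_c)$; the function names and the numerical values are illustrative assumptions, not taken from the text.

```python
import math

def cutoff_frequency(p_c, noise_n, c1, n):
    """Solve P_c * |C1|^2 * w_c**(-2n) = N for the cutoff frequency w_c."""
    return (p_c * abs(c1) ** 2 / noise_n) ** (1.0 / (2 * n))

def effective_snr(energy, p_c, noise_n, c1, n):
    """Rough estimate d_eff^2 ~ 2 E w_c / (pi P_c) for strong clutter."""
    w_c = cutoff_frequency(p_c, noise_n, c1, n)
    return 2.0 * energy * w_c / (math.pi * p_c)

def snr_gain_from_doubling_power(n, energy=1.0, p_c=1.0e4, noise_n=1.0, c1=1.0):
    """Ratio of d_eff^2 after and before doubling the transmitter power.
    Both E and P_c are taken proportional to the transmitter power."""
    before = effective_snr(energy, p_c, noise_n, c1, n)
    after = effective_snr(2.0 * energy, 2.0 * p_c, noise_n, c1, n)
    return after / before

for n in (1, 2, 4):
    # The gain should equal 2**(1/(2n)), shrinking toward 1 as n grows.
    print(n, snr_gain_from_doubling_power(n), 2.0 ** (1.0 / (2 * n)))
```

Under these assumptions, doubling the transmitter power multiplies $d_{\mathrm{eff}}^2$ by only $2^{1/2n}$, which illustrates why a steeply falling spectrum benefits little from added power.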

In order to study in more detail the effect of increasing the transmitted power,
let us write the effective signal-to-noise ratio (10-50) as

$$d_{\mathrm{eff}}^2 = \frac{2EW}{P_c}\,\gamma, \qquad \gamma = \frac{P_c}{W} \int_{-\infty}^{\infty} \frac{|f(\omega)|^2}{N + P_c |f(\omega)|^2}\, \frac{d\omega}{2\pi}, \tag{10-52}$$

for a signal with no Doppler shift ($w = 0$). The factor $2EW/P_c$ is independent of
the transmitted power. In Fig. 10-4 we have plotted the factor $\gamma$ versus the
clutter-to-noise ratio $r = P_c/NW$ for four types of signal; $W$ is the equivalent bandwidth
defined in (7-11). The signals and their equivalent bandwidths $W$ are listed in Table
10-1. The signals have been normalized as in (10-2).

For the bandlimited signal it is simple to show that

$$\gamma = \frac{r}{1 + r}, \qquad r = \frac{P_c}{NW},$$

and increasing the transmitted power, to which $r$ is proportional, leads to saturation.
The bandlimited signals $\mathrm{Re}\, F(t) \exp i\Omega t$ have long tails and cannot easily be
resolved. The more sharply the spectrum $|f(\omega)|$ decreases to zero as $|\omega| \to \infty$, the
less effectively the pulses can be resolved or detected in the midst of their own clutter.
At the opposite extreme, rectangular pulses, as one might expect, are the most
easily resolved and the most reliably detected in clutter noise.

Figure 10-4. Signal-to-noise-ratio factor $\gamma$ (10-52) as a function of the clutter-to-noise
ratio $P_c/NW$ for the signals listed in Table 10-1: (a) rectangular, (b) bilateral
exponential, (c) Gaussian, (d) bandlimited pulse.
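
The factor $\gamma$ can be evaluated by the trapezoidal rule once (10-52) is rewritten in terms of $r = P_c/NW$ as $\gamma = r \int |f|^2 (1 + rW|f|^2)^{-1}\, d\omega/2\pi$. The sketch below does this for the bandlimited and Gaussian spectra. It assumes a bandlimited spectrum $|f|^2 = 1/W$ on $|\omega| < \pi W$ and, for the Gaussian pulse, an equivalent bandwidth $W = a/\sqrt{\pi}$; the definition (7-11) lies outside this section, so that value is an assumption here, and all function names are illustrative.

```python
import math

def gamma_factor(f_squared, w_equiv, r, omega_max, steps=20000):
    """Trapezoidal estimate of
        gamma = r * integral of |f|^2 / (1 + r W |f|^2) d(omega)/(2 pi)
    over (-omega_max, omega_max), i.e. (10-52) with r = P_c / (N W)."""
    h = 2.0 * omega_max / steps
    total = 0.0
    for k in range(steps + 1):
        omega = -omega_max + k * h
        s = f_squared(omega)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * s / (1.0 + r * w_equiv * s)
    return r * total * h / (2.0 * math.pi)

def bandlimited_spectrum(w_equiv):
    """|f|^2 = 1/W inside |omega| < pi W, zero outside (assumed form)."""
    return lambda omega: 1.0 / w_equiv if abs(omega) < math.pi * w_equiv else 0.0

def gaussian_spectrum(a):
    """|f|^2 = (2 pi / a^2)^(1/2) exp(-omega^2 / (2 a^2)), unit energy."""
    return lambda omega: math.sqrt(2.0 * math.pi) / a * math.exp(-omega ** 2 / (2.0 * a * a))

def gamma_bandlimited(r, w_equiv=1.0):
    return gamma_factor(bandlimited_spectrum(w_equiv), w_equiv, r, omega_max=4.0 * w_equiv)

def gamma_gaussian(r, a=1.0):
    w_equiv = a / math.sqrt(math.pi)  # assumed equivalent bandwidth of the Gaussian pulse
    return gamma_factor(gaussian_spectrum(a), w_equiv, r, omega_max=12.0 * a)

for r in (0.1, 1.0, 10.0, 100.0):
    # Columns: r, numerical bandlimited gamma, closed form r/(1+r), Gaussian gamma.
    print(r, gamma_bandlimited(r), r / (1.0 + r), gamma_gaussian(r))
```

The bandlimited column should reproduce the saturation $\gamma = r/(1+r)$, while the Gaussian values keep growing slowly with $r$.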

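For the bilateral exponential signal, application of [Dwi61, eq. 856.31] or the
residue theorem to (10-52) yields

$$\gamma = \frac{r}{(1 + s^2)^{1/2}\, \mathrm{Re}\left[(1 + is)^{1/2}\right]}, \qquad s = \left(\frac{8r}{5}\right)^{1/2}, \qquad r = \frac{P_c}{NW}. \tag{10-53}$$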

For the other two types of signal the factor $\gamma$ was evaluated by numerical integration.
For Gaussian signals the integral in (10-52) could be efficiently integrated by the
trapezoidal rule. For rectangular signals that integral becomes

$$\gamma = \frac{2r}{\pi} \int_0^{\infty} \frac{\sin^2 x}{x^2 + C \sin^2 x}\, dx, \qquad C = rWT = \tfrac32 r.$$

Table 10-1 Signal Parameters

    Signal                      F(t)                                                          f(ω)                                                    W
    (a) Rectangular             $T^{-1/2}$ for $-\tfrac12 T < t < \tfrac12 T$, 0 otherwise    $2\sin(\tfrac12 \omega T)/(\omega T^{1/2})$             $3/(2T)$
    (b) Bilateral exponential   $a^{1/2} e^{-a|t|}$                                           $2a^{3/2}/(\omega^2 + a^2)$                             $2a/5$
    (c) Gaussian                $(2a^2/\pi)^{1/4} e^{-a^2 t^2}$                               $(2\pi/a^2)^{1/4} e^{-\omega^2/4a^2}$                   $\pi^{-1/2} a$
    (d) Bandlimited             $\sin(\pi W t)/(\pi W^{1/2} t)$                               $W^{-1/2}$ for $-\pi W < \omega < \pi W$, 0 otherwise   $W$

Because of the slow decline of the integrand to zero as $x \to \infty$, it was necessary to
reform the integrand somewhat drastically in order to reduce the time required by
the numerical integration. The range from 0 to $\infty$ was broken into ranges of width
$\pi$, in each half of the $k$th of which the substitution

$$x = (k + \tfrac12)\pi \pm u, \qquad 0 \le k < \infty,$$

was made, reducing the integral to

$$\gamma = \frac{2r}{\pi} \int_0^{\pi/2} \cos^2 u \sum_{k=0}^{\infty} \left\{ \frac{1}{[(k + \tfrac12)\pi + u]^2 + C\cos^2 u} + \frac{1}{[(k + \tfrac12)\pi - u]^2 + C\cos^2 u} \right\} du.$$
In order further to hasten the convergence, we subtracted the term

$$\gamma_0 = \frac{2r}{\pi} \int_0^{\pi/2} \cos^2 u \, du \sum_{k=0}^{\infty} \frac{2}{(k + \tfrac12)^2 \pi^2} = \frac{r}{2}$$

[Dwi61, eq. 48.12]. After rather much algebra, we wrote the integral as

$$\gamma = \gamma_0 + \frac{4r}{\pi} \int_0^{\pi/2} \cos^2 u \; G(u)\, du,$$

$$G(u) = \sum_{k=0}^{\infty} \frac{\mu_k^2 (3u^2 - C\cos^2 u) - (u^2 + C\cos^2 u)^2}{\mu_k^2 \left[(\mu_k + u)^2 + C\cos^2 u\right]\left[(\mu_k - u)^2 + C\cos^2 u\right]}, \qquad \mu_k = (k + \tfrac12)\pi,$$

and this was subjected to a numerical integration routine in the computer. Execution
was slow, and the results, plotted as curve (a) in Fig. 10-4, were at last obtained.
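
The accelerated form lends itself to a direct numerical check. The sketch below is a minimal illustration, not the routine the author used: it assumes the series is truncated after a fixed number of terms (`n_k`) and that the integral over $u$ is done by the midpoint rule (`n_u` points), which avoids evaluating the integrand exactly at $u = \pi/2$, where the $k = 0$ term is a removable 0/0. The function name and defaults are hypothetical.

```python
import math

def gamma_rectangular(r, n_u=1000, n_k=200):
    """Factor gamma for the rectangular pulse from
        gamma = r/2 + (4 r / pi) * integral_0^{pi/2} cos^2(u) G(u) du,
    with C = r W T = 3 r / 2 and mu_k = (k + 1/2) pi."""
    c = 1.5 * r
    h = (0.5 * math.pi) / n_u
    total = 0.0
    for j in range(n_u):
        u = (j + 0.5) * h          # midpoint of the j-th subinterval
        cos2 = math.cos(u) ** 2
        a = c * cos2               # the recurring combination C cos^2 u
        g = 0.0
        for k in range(n_k):
            mu = (k + 0.5) * math.pi
            numerator = mu * mu * (3.0 * u * u - a) - (u * u + a) ** 2
            denominator = mu * mu * ((mu + u) ** 2 + a) * ((mu - u) ** 2 + a)
            g += numerator / denominator
        total += cos2 * g
    return 0.5 * r + (4.0 * r / math.pi) * total * h

for r in (0.1, 1.0, 10.0):
    print(r, gamma_rectangular(r))
```

For small $r$ the result should approach $\gamma \approx r(1 - r)$, and for large $r$ it keeps growing rather than saturating, in line with the remark that rectangular pulses are the most reliably detected in clutter.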

When after reflection from a moving target the returning echo has a Doppler
shift $w$ with respect to the transmitted carrier frequency $\Omega$, we must replace the
numerator of the integrand in (10-52) by $|f(\omega - w)|^2$. For Gaussian pulses the result
can be efficiently integrated by the trapezoidal rule, and in Fig. 10-5 we plot the
resulting values of the parameter $\gamma$ versus $r = P_c/NW$ for six values of the Doppler
shift parameter $\Delta$, a normalized form of the Doppler shift $w$. The straight line marked
$\infty$ represents $\gamma = r$, as when the echo frequency has been shifted completely outside
the spectral band of the clutter. The greater the Doppler shift, the more reliably the
signals can be detected in the midst of the clutter produced by the transmitted pulse.

Figure 10-5. Signal-to-noise-ratio factor $\gamma$ for Doppler-shifted Gaussian signals as
a function of the clutter-to-noise ratio $P_c/NW$. Curves are indexed with the Doppler
shift parameter $\Delta$. The curve marked $\infty$ corresponds to $w = \infty$.



When the distribution $R_c(w)$ of Doppler shifts in (10-44) is much narrower
than the spectrum $|f(\omega)|$ of the signal, the clutter is said to be underspread, and the
foregoing analysis applies when the Doppler shift $w$ of the target echo to be detected
is zero. If on the other hand the Doppler shift $w$ is large enough so that the signal
spectrum $|f_s(\omega)| = |f(\omega - w)|$ lies outside the narrowband spectral density $\Psi(\omega)$ of
the clutter, the effective signal-to-noise ratio is the same as the input signal-to-noise
ratio $2E/N$.

When the distribution $R_c(w)$ of Doppler shifts of the clutter echoes is much
broader than the spectrum $|f(\omega)|$ of the signal, the clutter is said to be overspread.
If the Doppler-shifted spectrum $|f(\omega - w)|$ of the target echo lies within the region
where the clutter spectral density $\Psi(\omega)$ is significantly large, the target echo is being
sought in the midst of noise that is effectively white. The optimum filter is then
matched to the signal itself.

The similarity of the optimum filter (10-48) for detection of a signal in clutter
to that derived in Sec. 10.1 for distinguishing close signals, see (10-40), makes it
plausible that the detection of a signal when other signals of the same form are
potentially present can be treated as the detection of the same signal in the presence
of Gaussian clutter, which consists of a dense succession of weak signals arriving
at random times and with random amplitudes and phases. Resolving the wanted
signal in the midst of all those randomly displaced nearby signals is much the same
as resolving it from a few strong signals in the same temporal neighborhood. The
signals that might arrive close to the one to be detected have the same effect on the
structure of the maximum-likelihood receiver as the randomly arriving signals that
constitute clutter noise.

If we extend this principle to cover interfering signals that differ in both arrival
time $\tau$ and Doppler shift $w$ from the one to be detected, with the Doppler shift $w$
anywhere in a band much broader than the signal spectrum $|f(\omega)|$, we are led to the
conclusion that the maximum-likelihood receiver for this situation will be much like
the maximum-likelihood receiver for detecting the signal in the midst of overspread
clutter. The overspread clutter resembles an additional white Gaussian noise, and
as we saw in Sec. 7.2, the maximum-likelihood receiver is closely approximated by
one that consists of a filter matched to the signal itself,

$$Y(\omega) = f_s^*(\omega)\, e^{-i\omega T'}, \qquad f_s(\omega) = f(\omega - w), \tag{10-54}$$

followed by a rectifier; when the output of the rectifier exceeds a certain decision
level, the receiver decides that a signal is present. Because a range of Doppler
shifts $w$ is anticipated, the input must be passed through a bank of parallel filters
matched to signals as in (10-54) for an array of carrier frequencies $\Omega + w$ uniformly
spaced over that range. The resolvability of signals by a receiver of this type is then
governed largely by the form of the ambiguity function $\lambda(\tau, w)$ to be studied in the
next section, and it is principally by designing the shape of the transmitted signal
$\mathrm{Re}\, F(t) \exp i\Omega t$ so that the ambiguity function has the most favorable attainable
form that good resolution is to be achieved over a broad range of signal arrival
times and Doppler shifts.

10.2.3 Coherent Pulse Trains



If the transmitted signal is a succession of coherent pulses $E(t - kT_r)$ separated by
a repetition period $T_r$,

$$F(t) = M^{-1/2} \sum_{k=0}^{M-1} E(t - kT_r), \tag{10-55}$$

as in (6-118), its spectrum has the form

$$f(\omega) = \sqrt{M}\, e(\omega)\, e^{-i\omega \bar{T}}\, C(\omega), \qquad \bar{T} = \tfrac12 (M - 1) T_r, \tag{10-56}$$

as in (6-120) and (6-121), where

$$C(\omega) = \frac{\sin \tfrac12 M T_r \omega}{M \sin \tfrac12 T_r \omega} \tag{10-57}$$

is the comb function, and $e(\omega)$ is the Fourier transform of the component pulses
$E(t)$. The magnitude $|C(\omega)|$ is plotted in Fig. 6-3 for $M = 20$. Then

$$|f(\omega)|^2 = M |e(\omega)|^2 |C(\omega)|^2.$$
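
The comb function of (10-57) is easy to tabulate. The following sketch, with a hypothetical function name and illustrative parameter values, evaluates $C(\omega) = \sin(\tfrac12 M T_r \omega) / [M \sin(\tfrac12 T_r \omega)]$ and handles the points $\omega = 2\pi k/T_r$, where numerator and denominator vanish together, by taking the limit with l'Hopital's rule.

```python
import math

def comb(omega, m, t_r):
    """Comb function C(omega) = sin(M T_r omega / 2) / (M sin(T_r omega / 2))."""
    half = 0.5 * t_r * omega
    denominator = math.sin(half)
    if abs(denominator) < 1e-12:
        # Limit at omega = 2 pi k / T_r: ratio of derivatives, equal to +1 or -1.
        return math.cos(m * half) / math.cos(half)
    return math.sin(m * half) / (m * denominator)

M, T_R = 20, 1.0e-3                          # illustrative values: 20 pulses, 1 ms period
tine_spacing = 2.0 * math.pi / T_R           # separation of the main peaks
first_null = 2.0 * math.pi / (M * T_R)       # first zero beside each main peak

print(abs(comb(0.0, M, T_R)))                # main peak at the carrier
print(abs(comb(tine_spacing, M, T_R)))       # next main peak
print(abs(comb(first_null, M, T_R)))         # null between peaks
```

The main peaks of $|C(\omega)|$ have unit height and recur every $2\pi/T_r$, and each is flanked by nulls at a distance $2\pi/MT_r$, which sets the tine width mentioned in the text.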

The narrowband spectral density $\Psi(\omega)$ of the clutter will as in (10-44) be the
result of convolving this with the distribution $R_c(w)$ of Doppler shifts induced by
the motions of the scatterers. As we saw in Sec. 6.3.5, the width of the main peaks
of $|f(\omega)|^2$ is on the order of $2\pi/MT_r$; and when a long pulse train is used, this
may be smaller than the width of the scattering distribution $R_c(w)$. The resulting
spectral density $\Psi(\omega)$ will be a comb whose tines are separated in angular frequency
by $2\pi/T_r$. The width of each tine will be on the order of the sum of $2\pi/MT_r$ and
the width of the distribution $R_c(w)$, and in general this will be much smaller than
the separation $2\pi/T_r$. The squared pulse spectrum $|e(\omega)|^2$ "modulates" the entire
density $\Psi(\omega)$, which spans a range of angular frequencies on the order of $\Delta t^{-1}$,
where $\Delta t^2$ is the mean-square duration of the component pulses $E(t)$.

The transfer function of the optimum filter for detecting a signal having Doppler
shift $w$ in the midst of this type of clutter has the form

$$Y(\omega; w) = \frac{f^*(\omega - w)\, e^{-i\omega T'}}{\tfrac12 N + \Psi(\omega)}, \tag{10-58}$$

an evident modification of (10-46), and the effective signal-to-noise ratio governing
the detectability of the signal is

$$d_{\mathrm{eff}}^2 = 2E \int_{-\infty}^{\infty} \frac{|f(\omega - w)|^2}{N + 2\Psi(\omega)}\, \frac{d\omega}{2\pi},$$

as in (10-50).

The receiver will consist of a bank of filters of the form in (10-58) tuned for a
discrete set of Doppler shifts $w$ throughout the expected range, and each is followed
by a rectifier whose output is observed during the interval $(T', T + T')$ when target
echoes are expected. Each such filter can be considered as a cascade of a filter whose
transfer function is

$$Y_0(\omega) = \left[\tfrac12 N + \Psi(\omega)\right]^{-1}$$

and a filter matched to the Doppler-shifted signal $\mathrm{Re}\, F(t) e^{i(\Omega + w)t}$. Indeed, a single
filter $Y_0(\omega)$ could precede a bank of filters matched to those signals. This filter
$Y_0(\omega)$ attenuates all frequencies lying within the peaks of the clutter spectral density
$\Psi(\omega)$, and signals arriving with Doppler shifts $w$ falling into any of those peaks will
be severely attenuated.

When the transmitter is stationary with respect to the bulk of the scatterers,
signals arriving with Doppler shifts $w$ that are integral multiples of $2\pi/T_r$ will suffer
this attenuation. These Doppler shifts result from targets moving with velocities
that are integral multiples of $\tfrac12 \lambda/T_r$, where $\lambda$ is the wavelength of the transmitted
radiation. These are called the blind velocities. For a transmitted frequency of
3000 MHz, for instance, $\lambda = 0.1$ m; and if the pulse repetition period $T_r$ equals $10^{-3}$
sec, the blind velocities are multiples of 50 m/sec ≈ 180 km/hr ≈ 112 mph.
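
The numerical example can be reproduced directly from the relation $v_k = k\lambda/(2T_r)$. The sketch below uses hypothetical function names and takes the wavelength from the exact speed of light, so the result differs slightly from the rounded figure of 50 m/sec obtained with $\lambda = 0.1$ m.

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def blind_velocities(carrier_hz, repetition_period_s, count=3):
    """First `count` blind velocities in m/s: integral multiples of lambda / (2 T_r)."""
    wavelength = SPEED_OF_LIGHT / carrier_hz
    step = 0.5 * wavelength / repetition_period_s
    return [k * step for k in range(1, count + 1)]

def doppler_shift(velocity, carrier_hz):
    """Angular Doppler shift w = 2 pi (2 v / lambda) for radial velocity v."""
    wavelength = SPEED_OF_LIGHT / carrier_hz
    return 2.0 * 3.141592653589793 * 2.0 * velocity / wavelength

# Example from the text: 3000 MHz carrier, 1 ms repetition period.
for v in blind_velocities(3.0e9, 1.0e-3):
    print(round(v, 2), "m/s =", round(3.6 * v, 1), "km/hr")
```

Each printed velocity corresponds to a Doppler shift that is an integral multiple of $2\pi/T_r$ and so falls on a tine of the clutter comb.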

Targets moving with such relative velocities that their Doppler shifts $w$ differ
from any of those multiples of $2\pi/T_r$ by more than the width of the peaks in the
spectral density $\Psi(\omega)$, on the other hand, will be detected with an effective
signal-to-noise ratio $d_{\mathrm{eff}}^2$ on the order of $2E/N$. The transmission of coherent pulse trains
thus enables the radar to take greater advantage of the motion of its targets relative
to that of the clutter scatterers. At the same time, as we discussed in Sec. 6.3.5, it
entails the risk of ambiguities in estimates of the arrival times $\tau$ and the frequency
shifts $w$ of the returning echoes.
10.3 THE SPECIFICATION OF SIGNALS

10.3.1 General Properties of the Ambiguity Function

10.3.1.1 Definitions. Whenever it is necessary to distinguish or resolve two narrowband
signals in the presence of white Gaussian noise, the structure of the receiver
and its performance depend on their scalar product, or cross-correlation. When the
complex envelopes of the signals have a common form and the signals differ only
in certain nonrandom parameters $\theta$, the cross-correlation is termed the ambiguity
function of the signals and is written

$$\lambda(\theta_1, \theta_2) = \int_{-\infty}^{\infty} F(t; \theta_1)\, F^*(t; \theta_2)\, dt. \tag{10-59}$$

The normalization

$$\lambda(\theta, \theta) = \int_{-\infty}^{\infty} |F(t; \theta)|^2\, dt = 1 \tag{10-60}$$

for all values of the parameters is customary.

In radar the parameters chiefly serving to distinguish two echo signals are their
arrival times $\tau$ and the Doppler shifts $w$ of their carrier frequencies from a common
reference value. When the integration in (10-59) is, as usual, carried out over the
infinite range, the ambiguity function for these parameters depends only on the
differences of the epochs and frequencies of the two signals. If the epochs $-\tau/2$ and
$+\tau/2$ and the carrier frequencies $\Omega - w/2$ and $\Omega + w/2$ are assigned to the signals,
their complex envelopes can be written as

$$F(t; -\tfrac12\tau, -\tfrac12 w) = F(t - \tfrac12\tau)\, e^{-\frac12 i w (t - \frac12 \tau)},$$

$$F(t; \tfrac12\tau, \tfrac12 w) = F(t + \tfrac12\tau)\, e^{\frac12 i w (t + \frac12 \tau)},$$

and the ambiguity function $\lambda(-\tfrac12\tau, -\tfrac12 w; \tfrac12\tau, \tfrac12 w)$ becomes simply

$$\lambda(\tau, w) = \int_{-\infty}^{\infty} F(t - \tfrac12\tau)\, F^*(t + \tfrac12\tau)\, e^{-iwt}\, dt \tag{10-61}$$

as in (6-96). Other definitions differing from this by a phase factor have been
used, but because of its convenience the form in (10-61) has become standard. The
ambiguity function takes on its peak value at the origin, and with the complex
envelope $F(t)$ normalized to 1 as in (10-60),

$$|\lambda(\tau, w)| \le \lambda(0, 0) = 1,$$

as can be shown by means of the Schwarz inequality.

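If we introduce the spectrum $f(\omega)$ of the complex envelope $F(t)$,

$$f(\omega) = \int_{-\infty}^{\infty} F(t)\, e^{-i\omega t}\, dt,$$

we find by substituting the inverse Fourier transform into (10-61) that the ambiguity
function has much the same form in the frequency domain:

$$\lambda(\tau, w) = \int_{-\infty}^{\infty} f(\omega + \tfrac12 w)\, f^*(\omega - \tfrac12 w)\, e^{-i\omega\tau}\, \frac{d\omega}{2\pi}. \tag{10-62}$$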






10.3.1.2 The ambiguity surface. The ambiguity function $\lambda(\tau, w)$ is in general
complex, but the resolvability of two signals with a relative delay $\tau$ and a frequency
difference $w$ depends only on its magnitude $|\lambda(\tau, w)|$, which it is advantageous
to imagine plotted as the height of a surface over the $(\tau, w)$-plane. It can be called
the ambiguity surface. This quantity $|\lambda(\tau, w)|$ acquires further meaning if one considers
a bank of parallel filters used to detect a signal of unknown arrival time and
Doppler shift in the presence of white Gaussian noise, as described in Sec. 7.5. We
insert a test signal $\mathrm{Re}\, F(t) \exp i\Omega t$ at time $t = 0$ and determine the resulting output
from the filter matched to a signal $\mathrm{Re}\, F(t) \exp i(\Omega + w)t$ with frequency shift $w$. If
$-\tau$ denotes the time measured from the common delay $T'$ of the filters, the signal
component of the rectified output of this filter is proportional to $|\lambda(\tau, w)|$
[compare (6-99)]. If we suppose that the filters are matched for a dense set of
frequencies $\Omega + w$, and if we picture their rectified responses $R(T' - \tau; w)$ to the
signal $\mathrm{Re}\, F(t) \exp i\Omega t$ plotted as a function of time $\tau$ and frequency shift $w$, they
will form a surface similar to the ambiguity surface.

For every signal the ambiguity surface is peaked at the origin (0, 0) of the $(\tau, w)$-plane.
A second signal arriving with separations $\tau$ in time and $w$ in frequency that lie
under this central peak will be difficult to distinguish from the first signal. For many
types of signal the ambiguity function $|\lambda(\tau, w)|$ exhibits additional peaks elsewhere
over the $(\tau, w)$-plane. These sidelobes may conceal weak signals with arrival times
and carrier frequencies far from those of the first signal. In a measurement of
the arrival time and frequency of a single signal, the noise may cause one of the
subsidiary peaks to appear higher than the main one, leading to gross errors in
the result. The taller the sidelobes, the greater the probability of such errors, or
ambiguities, in Doppler shift and signal epoch. It is desirable, therefore, for the
central peak of the ambiguity function to be narrow and for there to be as few and
as low sidelobes as possible.

10.3.1.3 Restrictions on the ambiguity function. If there existed a signal
$F(t)$ whose ambiguity function equaled 1 at $\tau = 0$, $w = 0$ and zero everywhere else,
it could be distinguished from another signal having the same form, but separated
in time and frequency by displacements however small. The probability of error
in resolving two such signals would be no greater than the false-alarm probability
for detection. No such signal exists. Indeed, a function $\lambda(\tau, w)$ chosen arbitrarily
will not necessarily be the ambiguity function of any signal. Even the magnitude
$|\lambda(\tau, w)|$ is not at a designer's disposal, but must satisfy certain conditions.

An example of such a condition is the self-transform property of the squared
magnitude $|\lambda(\tau, w)|^2$, due to Siebert [Sie56]:

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |\lambda(\tau, w)|^2\, e^{i(xw - y\tau)}\, d\tau\, \frac{dw}{2\pi} = |\lambda(x, y)|^2. \tag{10-63}$$
To find a function $|\lambda(\tau, w)|$ possessing this property and having a form assuring good
resolution besides is no easy task. Even if it can be found, one must still assign to
it such a phase $\arg \lambda(\tau, w)$ that the function $|\lambda(\tau, w)| \exp[i \arg \lambda(\tau, w)] = \lambda(\tau, w)$ will be
the ambiguity function of some signal $F(t)$. Only when the proper phase is known
as well can the Fourier transform of $\lambda(\tau, w)$ with respect to $w$ be taken in order to
obtain

$$\int_{-\infty}^{\infty} \lambda(\tau, w)\, e^{iuw}\, \frac{dw}{2\pi} = F(u - \tfrac12\tau)\, F^*(u + \tfrac12\tau), \tag{10-64}$$

from which the signal envelope $F(t)$ can be found, within an arbitrary constant
phase factor, by setting $u = \tfrac12 t$, $\tau = -t$, and normalizing as in (10-60). Some further
restrictions on the amplitude and phase of $\lambda(\tau, w)$ and on its real and imaginary
parts have been reported by Stutt [Stu64].

An informative corollary of the self-transform property in (10-63) is derived by
setting $x = 0$ and $y = 0$:

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |\lambda(\tau, w)|^2\, d\tau\, \frac{dw}{2\pi} = 1. \tag{10-65}$$

The total volume under the surface $|\lambda(\tau, w)|^2$ must be equal to $2\pi$, no matter what
the waveform of the signal. This condition prevents our making $|\lambda(\tau, w)|$ small
everywhere in the $(\tau, w)$-plane away from the origin. The magnitude $|\lambda(\tau, w)|$ will
always have a peak over the point (0, 0), and if we try to make that peak more slender,
the values of $|\lambda(\tau, w)|$ elsewhere in the $(\tau, w)$-plane must rise in compensation. Much
effort has been expended in searching for signals whose ambiguity function has a
magnitude remaining below a specified level over as much of the $(\tau, w)$-plane as
possible. Instructive pictures of ambiguity functions of a variety of signals are to be
found in the book by Rihaczek [Rih69].

10.3.2 Single Pulses

10.3.2.1 Amplitude-modulated signals. The behavior of the ambiguity function
near its peak at the origin can be discovered by expanding the integrand of (10-61)
into a double Taylor series in $\tau$ and $w$. Putting

$$F(t - \tfrac12\tau) = F(t) - \tfrac12 F'(t)\tau + \tfrac18 F''(t)\tau^2 + O(\tau^3),$$

$$e^{-iwt} = 1 - iwt - \tfrac12 w^2 t^2 + O(w^3 t^3),$$

substituting into (10-61), evaluating certain of the integrals in $t$ by parts, and using
the definitions of the signal moments in Sec. 6.3.3, we obtain finally

$$\lambda(\tau, w) = 1 - i w \bar{t} - i \tau \bar{\omega} - \tfrac12 \overline{\omega^2}\, \tau^2 - \overline{\omega t}\, w \tau - \tfrac12 \overline{t^2}\, w^2 + \cdots, \tag{10-66}$$

and through quadratic terms the squared magnitude of the ambiguity function is

$$|\lambda(\tau, w)|^2 \approx 1 - \left(\Delta\omega^2 \tau^2 + 2\Delta(\omega t)\, w\tau + \Delta t^2 w^2\right), \tag{10-67}$$

where $\Delta\omega$ and $\Delta t$ are the rms bandwidth and duration of the signal, and

$$\Delta(\omega t) = \overline{\omega t} - \bar{\omega}\, \bar{t}$$

is its cross-moment of time and frequency. For small values of $\tau$ and $w$ the magnitude
$|\lambda(\tau, w)|$ is constant along contours similar to the uncertainty ellipse

$$\Delta\omega^2 \tau^2 + 2\Delta(\omega t)\, w\tau + \Delta t^2 w^2 = 1.$$

For the Gaussian pulse this similarity of the contours of constant magnitude
holds for all values of $\tau$ and $w$. A signal whose complex envelope is

$$F(t) = \left(\frac{a^2}{\pi}\right)^{1/4} \exp\left[-\tfrac12 (a^2 - ib)\, t^2\right] \tag{10-68}$$

has an amplitude of the Gaussian shape proportional to $\exp(-\tfrac12 a^2 t^2)$ and an
instantaneous frequency increasing linearly with time: $\phi'(t) = bt$; see (3-2). For this signal
the magnitude of the ambiguity function is

$$|\lambda(\tau, w)| = \exp\left[-\frac{(a^4 + b^2)\tau^2 + 2b\tau w + w^2}{4a^2}\right]. \tag{10-69}$$

This function is constant along elliptical contours of the form

$$(a^4 + b^2)\tau^2 + 2b\tau w + w^2 = 4a^2\mu^2,$$

which are similar to the uncertainty ellipse. The area of each contour is equal to
$4\pi\mu^2$, which is independent of the rate $b$ of change of the instantaneous frequency.
The effect of this linear frequency modulation on the pulse is only to rotate or shear
the elliptical contours $|\lambda(\tau, w)|$ = constant without changing their area. An improvement
in resolvability in one region due to the frequency modulation is accompanied
by a deterioration in some other region.
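
These statements can be verified numerically. The sketch below assumes the frequency-modulated Gaussian envelope $F(t) = (a^2/\pi)^{1/4}\exp[-\tfrac12(a^2 - ib)t^2]$ and the ambiguity-function definition (10-61); it compares a midpoint-rule evaluation of the integral with the closed form $|\lambda| = \exp\{-[(a^4+b^2)\tau^2 + 2b\tau w + w^2]/4a^2\}$ and computes the contour area. Function names, integration limits, and test values are illustrative assumptions.

```python
import cmath
import math

def envelope(t, a, b):
    """Gaussian envelope with linear FM: (a^2/pi)^(1/4) exp[-(a^2 - i b) t^2 / 2]."""
    return (a * a / math.pi) ** 0.25 * cmath.exp(-0.5 * (a * a - 1j * b) * t * t)

def ambiguity_numeric(tau, w, a, b, t_max=10.0, steps=4000):
    """Midpoint-rule estimate of integral F(t - tau/2) F*(t + tau/2) exp(-i w t) dt."""
    h = 2.0 * t_max / steps
    total = 0.0 + 0.0j
    for k in range(steps):
        t = -t_max + (k + 0.5) * h
        total += (envelope(t - 0.5 * tau, a, b)
                  * envelope(t + 0.5 * tau, a, b).conjugate()
                  * cmath.exp(-1j * w * t))
    return total * h

def ambiguity_magnitude(tau, w, a, b):
    """Closed-form magnitude of the ambiguity function of the FM Gaussian pulse."""
    q = (a ** 4 + b * b) * tau * tau + 2.0 * b * tau * w + w * w
    return math.exp(-q / (4.0 * a * a))

def contour_area(a, b, mu):
    """Area of the ellipse (a^4 + b^2) tau^2 + 2 b tau w + w^2 = 4 a^2 mu^2."""
    determinant = (a ** 4 + b * b) - b * b      # equals a^4 for any b
    return math.pi * 4.0 * a * a * mu * mu / math.sqrt(determinant)

print(abs(ambiguity_numeric(0.4, -0.7, 1.0, 3.0)), ambiguity_magnitude(0.4, -0.7, 1.0, 3.0))
print(contour_area(1.0, 0.0, 1.0), contour_area(1.0, 3.0, 1.0), 4.0 * math.pi)
```

The two areas printed on the last line agree with $4\pi\mu^2$ whatever the value of $b$, which is the shearing-without-change-of-area property described above.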

In applications where negligible Doppler shifts are expected, it is only the
behavior of the ambiguity function $\lambda(\tau, 0)$ along the $\tau$-axis that is important, and
this can be improved by making the rate $b$ much greater than $a^2$. An advantage of
the Gaussian signal is the absence of subsidiary peaks from its ambiguity function,
which much reduces the risk of large errors in measuring its epoch and its frequency.
Fowle et al. have extensively treated the generation and detection of the
frequency-modulated Gaussian signal [Fow63].

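10.3.2.2 Chirp modulation. Linear frequency modulation found one of its
first applications to the improvement of range resolution in the design of the "chirp"
radar [Kla60a]. This radar transmits a rectangular pulse with a quadratic phase,

$$F(t) = \begin{cases} T^{-1/2} \exp(\tfrac12 i b t^2), & -\tfrac12 T < t < \tfrac12 T, \\ 0, & \text{otherwise}, \end{cases} \tag{10-70}$$

and the total phase change, on the order of $bT^2$, from beginning to end is very large.
The ambiguity function of this signal is

$$\lambda(\tau, w) = \begin{cases} \dfrac{2 \sin\left[\tfrac12 (b\tau + w)(T - |\tau|)\right]}{(b\tau + w)\, T}, & |\tau| \le T, \\ 0, & |\tau| > T. \end{cases} \tag{10-71}$$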
Resolution in range only is governed by the values along the $\tau$-axis,

$$\lambda(\tau, 0) \approx \frac{\sin \tfrac12 bT\tau}{\tfrac12 bT\tau}, \qquad |\tau| \ll T,$$

a function that has a narrow peak at the origin $\tau = 0$. The width of this peak,
measured between the first zeros, is $(2\pi/bT^2)T$, which when $bT^2 \gg 2\pi$ is much
smaller than the duration $T$ of the original signal. For diagrams of this ambiguity
function, see [Rih69, pp. 173-5].

This function $\lambda(\tau, 0)$ represents the output of a filter matched to the signal
$\mathrm{Re}\, F(t) \exp i\Omega t$ when only that signal is fed into it. Because it is much narrower
when $bT^2 \gg 2\pi$ than the envelope of the signal itself, the matched filter is
said to compress the pulse. Radars transmitting such frequency-modulated signals
and receiving them with matched filters are called pulse-compression radars [Mor88,
pp. 123-55]. The danger of high-voltage breakdown in the output circuitry of the
transmitter limits the peak amplitude of a radar pulse, and the pulse-compression
radar can send out signals of much greater total energy than one that simply produces
a narrow amplitude-modulated pulse of the same rms bandwidth.

For certain combinations of delay $\tau$ and frequency shift $w$, however, the resolvability
of the chirp signal will be no better than that of an unmodulated rectangular
pulse of the same duration $T$ and the same energy. Along the line $w + b\tau = 0$, the
ambiguity function of the chirp signal is

$$\lambda(\tau, -b\tau) = \begin{cases} (T - |\tau|)/T, & |\tau| \le T, \\ 0, & |\tau| > T, \end{cases}$$

which is the same as the function $\lambda(\tau, 0)$ for the unmodulated square pulse. As with
the Gaussian signal, the frequency modulation displaces part of the volume under
the ambiguity surface to a different part of the $(\tau, w)$-plane. A Doppler-shifted chirp
signal can only with difficulty be distinguished from one that is merely delayed.
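
Both the compression along the $\tau$-axis and the uncompressed ridge along $w = -b\tau$ can be seen by evaluating the chirp ambiguity function directly. The sketch below assumes the closed form $\lambda(\tau, w) = 2\sin[\tfrac12(b\tau + w)(T - |\tau|)]/[(b\tau + w)T]$ for $|\tau| \le T$; the function names and the values $b = 400$, $T = 1$ are illustrative.

```python
import math

def chirp_ambiguity(tau, w, b, t_len):
    """Ambiguity function of a rectangular pulse of duration T with phase b t^2 / 2."""
    if abs(tau) >= t_len:
        return 0.0
    overlap = t_len - abs(tau)
    x = 0.5 * (b * tau + w) * overlap
    if abs(x) < 1e-12:
        return overlap / t_len                  # limit of sin(x)/x as x -> 0
    return math.sin(x) / x * overlap / t_len

def first_zero_delay(b, t_len):
    """Smallest tau > 0 with (1/2) b tau (T - tau) = pi, i.e. first zero on the tau-axis."""
    return 0.5 * (t_len - math.sqrt(t_len * t_len - 8.0 * math.pi / b))

B, T = 400.0, 1.0                               # b T^2 = 400, much larger than 2 pi
tau0 = first_zero_delay(B, T)
print("first zero at tau =", tau0, "; 2 pi/(b T) =", 2.0 * math.pi / (B * T))
print("on the ridge w = -b tau:", chirp_ambiguity(0.3, -(B * 0.3), B, T))
print("on the tau-axis at the same delay:", chirp_ambiguity(0.3, 0.0, B, T))
```

On the ridge the function falls off only as the triangle $(T - |\tau|)/T$, while on the $\tau$-axis it has already passed through many zeros at the same delay.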

Multiplication of the complex envelope of an arbitrary signal by $\exp \tfrac12 ibt^2$
shears its ambiguity function. If we define a new signal with complex envelope

$$F_1(t) = F(t)\, e^{\frac12 i b t^2},$$

its ambiguity function is

$$\lambda_1(\tau, w) = \int_{-\infty}^{\infty} F(t - \tfrac12\tau)\, F^*(t + \tfrac12\tau) \exp\left[\tfrac12 ib(t - \tfrac12\tau)^2 - \tfrac12 ib(t + \tfrac12\tau)^2 - iwt\right] dt$$

$$= \int_{-\infty}^{\infty} F(t - \tfrac12\tau)\, F^*(t + \tfrac12\tau)\, e^{-i(w + b\tau)t}\, dt = \lambda(\tau, w + b\tau).$$

If $\mathrm{Re}\, F(t) \exp i\Omega t$ is an amplitude-modulated signal, the principal axes of its ambiguity
function stand at right angles to each other; about these axes the ambiguity
function is symmetrical. For the "chirp-modulated" signal the axis of symmetry
originally along the line $w = 0$ lies instead along the slanting line $w = -b\tau$.

10.3.2.3 Hermite signals. An instructive generalization of the Gaussian signal
is the Hermite waveform, whose complex envelope is

$$F(t) \propto h_n(at\sqrt{2})\, e^{-a^2 t^2/2},$$

normalized as in (10-60), for any positive integer $n$, where $h_n(x)$ is the Hermite
polynomial defined in Sec. 5.1.1. These signals have an oscillatory amplitude modulation,
which changes sign $n$ times before finally decaying to zero. Klauder [Kla60b] and
Wilcox [Wil60] showed that the ambiguity function of this signal is

$$\lambda(\tau, w) = e^{-r^2/4} L_n(\tfrac12 r^2), \qquad r^2 = a^2\tau^2 + a^{-2} w^2,$$

where $L_n(x)$ is the $n$th Laguerre polynomial, defined in the Appendix, Sec. C.1.
Around the central peak $|\lambda(\tau, w)|$ has elliptical ridges whose heights decrease from
the center; between them are elliptical contours on which $\lambda(\tau, w) = 0$. Diagrams are
to be found in [Kla60b].

As $n$ increases, the central peak of $|\lambda(\tau, w)|$ becomes narrower and narrower,
but at the same time the rms bandwidth and the rms duration of the signals increase.
Asymptotically for $n \gg 1$,

$$\lambda(\tau, w) \approx J_0\left(r\sqrt{2n + 1}\right), \qquad r = \left[a^2\tau^2 + a^{-2}w^2\right]^{1/2}, \qquad 0 < r < 2\sqrt{2n + 1}$$

[Erd53, vol. 2, p. 199, eq. 10.15(2)]. For large $n$, $\lambda(\tau, w)$ vanishes for values of $\tau$ and
$w$ on the ellipses

$$\left[(n + \tfrac12)(a^2\tau^2 + a^{-2}w^2)\right]^{1/2} \approx 1.70,\ 3.90,\ 6.12,\ \ldots .$$

The first elliptical ridge surrounding the central peak has a height of about 0.4 and
the one next to it a height of about 0.3. Far from the center the function is roughly
sinusoidal, but with a slowly decreasing amplitude,

$$\lambda(\tau, w) \approx \left[\frac{2}{(n + \tfrac12)\pi^2 r^2}\right]^{1/4} \cos\left(r\sqrt{2n + 1} - \tfrac14\pi\right), \qquad r = \left[a^2\tau^2 + a^{-2}w^2\right]^{1/2}.$$

After $n$ zeros there is a final ridge, beyond which $\lambda(\tau, w)$ drops to zero. These ridges
in $|\lambda(\tau, w)|$ render the signals liable to ambiguity in time and frequency.
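
The location of the first zero ellipse can be checked against the exact expression $\lambda = e^{-r^2/4} L_n(\tfrac12 r^2)$. The sketch below assumes the standard Laguerre polynomials with $L_n(0) = 1$, generated by the three-term recurrence, and finds the first zero in $r$ by bisection; the function names and the bracketing interval are illustrative choices, not taken from the text.

```python
import math

def laguerre(n, x):
    """L_n(x) from the recurrence (k+1) L_{k+1} = (2k + 1 - x) L_k - k L_{k-1}."""
    if n == 0:
        return 1.0
    previous, current = 1.0, 1.0 - x
    for k in range(1, n):
        previous, current = current, ((2 * k + 1 - x) * current - k * previous) / (k + 1)
    return current

def hermite_ambiguity(r, n):
    """lambda as a function of r = (a^2 tau^2 + w^2 / a^2)^(1/2) for the n-th Hermite signal."""
    return math.exp(-0.25 * r * r) * laguerre(n, 0.5 * r * r)

def first_zero_radius(n, iterations=80):
    """Bisection for the first zero of lambda; the upper bracket 2.5/sqrt(n + 1/2)
    lies between the first and second zeros for the values of n tried here."""
    low, high = 0.0, 2.5 / math.sqrt(n + 0.5)
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if hermite_ambiguity(mid, n) > 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

for n in (5, 20, 50):
    r1 = first_zero_radius(n)
    # The scaled radius should approach 1.70 as n grows.
    print(n, r1, math.sqrt(n + 0.5) * r1)
```

For $n = 50$ the scaled radius already agrees with the asymptotic value 1.70 to about three figures.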

These Hermite signals have mean-square bandwidths and durations given by

$$\Delta t^2 = (n + \tfrac12)\, a^{-2}, \qquad \Delta\omega^2 = (n + \tfrac12)\, a^2.$$

The central peak of their ambiguity function covers an area of about $9.1/(\Delta\omega\,\Delta t)$
for $n \gg 1$. The area of the $(\tau, w)$-plane occupied by the entire ambiguity function,
out to where it begins its final exponential descent to zero, is on the order of the
product $\Delta\omega\,\Delta t$.

10.3.2.4 Moments of the squared ambiguity function. For all amplitude-modulated
signals the area covered by the central peak is on the order of $(\Delta\omega\,\Delta t)^{-1}$,
as is evident from (10-67). The part of the ambiguity function significantly greater
than zero covers an area on the order of $\Delta\omega\,\Delta t$, as can be deduced from the relations

$$\tfrac12 \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \tau^2 |\lambda(\tau, w)|^2\, d\tau\, \frac{dw}{2\pi} = \Delta t^2,$$

$$\tfrac12 \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \tau w\, |\lambda(\tau, w)|^2\, d\tau\, \frac{dw}{2\pi} = -\Delta(\omega t), \tag{10-72}$$

$$\tfrac12 \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} w^2 |\lambda(\tau, w)|^2\, d\tau\, \frac{dw}{2\pi} = \Delta\omega^2.$$

These can be derived by making a power-series expansion of the exponential in
the integrand of (10-63) and applying (10-67) to the right side. For an amplitude-modulated
pulse, for which $\Delta(\omega t) = 0$, the rms widths of the ambiguity function in
$\tau$ and $w$ are on the same order of magnitude as those of the signal itself in time
and frequency. The product of these widths crudely measures the area of the entire
ambiguity function.

10.3.2.5 A conjectural ambiguity function. A function of $\tau$ and $w$ that well
illustrates these properties was proposed by Charles Persons:

$$L(\tau, w) = \frac{1}{1 + WT}\left\{ WT \exp\left[-\tfrac12\left(W^2\tau^2 + T^2 w^2\right)\right] + \exp\left[-\frac{\tau^2}{2T^2} - \frac{w^2}{2W^2}\right]\right\}. \tag{10-73}$$

Whether this is the absolute square of the ambiguity function of any signal, $L(\tau, w) =
|\lambda(\tau, w)|^2$, is unknown; it does satisfy the self-transform relation (10-63). In Fig. 10-6
we have sketched the cross sections of $L(\tau, w)$ along the $\tau$- and $w$-axes for $WT \gg 1$.
The mean-square bandwidth and the mean-square duration of whatever signal might
possess such an ambiguity function can be shown by (10-67) to be

$$\Delta\omega^2 = \frac{W^3T^3 + 1}{2T^2(1 + WT)}, \qquad \Delta t^2 = \frac{W^3T^3 + 1}{2W^2(1 + WT)},$$

and one can show that

$$\Delta\omega^2\,\Delta t^2 = \frac{(W^3T^3 + 1)^2}{4W^2T^2(1 + WT)^2}.$$

As is required for any amplitude-modulated signal, $\Delta\omega^2\,\Delta t^2 \ge \tfrac14$. For $WT \gg 1$,
$\Delta\omega^2 \approx \tfrac12 W^2$ and $\Delta t^2 \approx \tfrac12 T^2$. Near the central peak the first term of (10-73) then
dominates; far from it the second dominates. The central peak covers an area on
the order of $2\pi/WT$, and the broad skirt represented by the second term in (10-73)
covers an area of the order of $2\pi WT$.
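
The properties claimed for (10-73) can be checked numerically. The following is only a sketch, not part of the original treatment: it evaluates $L(\tau, w)$ on a grid for the illustrative values $W = T = 2$ (an assumption), approximates the volume integral and the second moments of (10-72) by Riemann sums, and compares them with the closed forms quoted above. The function names `persons_L` and `grid_moments` are hypothetical.

```python
import numpy as np

def persons_L(tau, w, W, T):
    """Conjectural squared ambiguity function of (10-73)."""
    WT = W * T
    peak = WT * np.exp(-0.5 * ((W * tau) ** 2 + (T * w) ** 2))  # narrow central peak
    skirt = np.exp(-0.5 * ((tau / T) ** 2 + (w / W) ** 2))      # broad skirt
    return (peak + skirt) / (1.0 + WT)

def grid_moments(W, T, n=801, span=8.0):
    """Riemann-sum estimates of the volume and of the moments in (10-72)."""
    tau = np.linspace(-span * T, span * T, n)
    w = np.linspace(-span * W, span * W, n)
    # Area element d(tau) dw / (2 pi)
    d_area = (tau[1] - tau[0]) * (w[1] - w[0]) / (2.0 * np.pi)
    TAU, WW = np.meshgrid(tau, w, indexing="ij")
    L = persons_L(TAU, WW, W, T)
    volume = np.sum(L) * d_area
    dt2 = 0.5 * np.sum(TAU**2 * L) * d_area   # mean-square duration
    dw2 = 0.5 * np.sum(WW**2 * L) * d_area    # mean-square bandwidth
    return volume, dt2, dw2

W, T = 2.0, 2.0   # illustrative values only
WT = W * T
volume, dt2, dw2 = grid_moments(W, T)
dt2_formula = (WT**3 + 1) / (2 * W**2 * (1 + WT))
dw2_formula = (WT**3 + 1) / (2 * T**2 * (1 + WT))
print(volume, dt2, dt2_formula, dw2, dw2_formula, dt2 * dw2)
```

The grid must be fine enough to resolve the narrow peak (widths $1/W$ and $1/T$) and wide enough to hold the skirt (widths $T$ and $W$), so the number of points needed grows with $WT$.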

In order for amplitude-modulated signals to provide good overall resolution,
they must have a large time-bandwidth product $\Delta t\,\Delta\omega$. When $\Delta t\,\Delta\omega \gg 1$, the central
peak of the ambiguity surface will be slender; and as the rest of the function takes
up an area on the order of $\Delta t\,\Delta\omega$, $|\lambda(\tau, w)|$ must attain rather large values outside
the center in order to meet the volume constraint given by (10-65). The average level
of $|\lambda(\tau, w)|$ in that region will be on the order of $(\Delta t\,\Delta\omega)^{-1/2}$.

Figure 10-6. Conjectural squared ambiguity function (10-73).



It does not suffice, however, to make $\Delta t\,\Delta\omega$ large by introducing frequency
modulation. As Rihaczek [Rih65] emphasized, the areas in question then involve
not $\Delta t\,\Delta\omega$, but

$$\left\{\Delta\omega^2\,\Delta t^2 - [\Delta(\omega t)]^2\right\}^{1/2},$$

which also appears in the variances of the errors incurred in simultaneously measuring
arrival time and frequency; see (6-114). As we saw in connection with (6-117),
linear frequency modulation does not change this quantity at all; its only effect is to
rotate or shear the ambiguity surface with respect to the $\tau$- and $w$-axes. If signals
might arrive with time and frequency separations anywhere in a broad area of the
$(\tau, w)$-plane, no overall improvement of their resolution can be achieved in this way.

10.3.3 Pulse Trains 



10.3.3.1 The ambiguity function of a uniform train. Thus far we have imagined
the signal $F(t)$ as consisting of a single pulse. Certain radars, however, transmit
a sequence of coherent pulses, constraining their phases to have a definite, known
relationship to each other. Let us suppose that a train of $M$ such coherent pulses
$E(t)$ of equal amplitudes is received from a target, and let us study the form of its
ambiguity function $\lambda(\tau, w)$. The complex envelope of this signal is given in (10-55).
We again assume that successive pulses overlap to a negligible degree; $T_r$ is the
repetition period between the pulses, and the composite signal is normalized to 1 as
in (10-60).

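To calculate the ambiguity function of a train of nonoverlapping pulses, we
use the form in (10-62), writing the spectrum of the pulse train as

$$f(\omega) = M^{-1/2}\, e(\omega) \sum_{k=0}^{M-1} \exp(-ik\omega T_r);$$

$e(\omega)$ is the Fourier transform of the component pulses $E(t)$. The ambiguity function
is then

$$\lambda(\tau, w) = \frac{1}{M}\int_{-\infty}^{\infty} e(\omega + \tfrac12 w)\, e^*(\omega - \tfrac12 w) \sum_{j=0}^{M-1}\sum_{k=0}^{M-1} \exp\left[-ij(\omega + \tfrac12 w)T_r + ik(\omega - \tfrac12 w)T_r - i\omega\tau\right] \frac{d\omega}{2\pi}$$

$$= \frac{1}{M}\sum_{j=0}^{M-1}\sum_{k=0}^{M-1} \exp\left[-\tfrac12 i(j + k)wT_r\right] \lambda_0\big(\tau + (j - k)T_r,\, w\big),$$

where as in (10-62) $\lambda_0(\tau, w)$ is the ambiguity function of the individual pulses.

Suppose that $(p - \tfrac12)T_r < \tau < (p + \tfrac12)T_r$, $p \ge 0$. Then because the pulses do
not overlap, the only terms contributing to the double sum are those with $k - j = p$,
and

$$\lambda(\tau, w) = \frac{1}{M}\,\lambda_0(\tau - pT_r, w)\, e^{-ipwT_r/2} \sum_{j=0}^{M-1-p} e^{-ijwT_r}, \qquad (p - \tfrac12)T_r < \tau < (p + \tfrac12)T_r.$$

Now

$$\sum_{k=0}^{M'-1} e^{-ikx} = e^{-i(M'-1)x/2}\,\frac{\sin \tfrac12 M'x}{\sin \tfrac12 x};$$

for $x = 0$, this sum equals $M'$. We then obtain

$$|\lambda(\tau, w)| = |\lambda_0(\tau - pT_r, w)| \left|\frac{\sin \tfrac12 (M - |p|)wT_r}{M \sin \tfrac12 wT_r}\right|, \qquad (p - \tfrac12)T_r < \tau < (p + \tfrac12)T_r, \quad -(M - \tfrac12)T_r < \tau < (M - \tfrac12)T_r. \tag{10-74}$$

For $|\tau| > MT_r$, the shifted pulse trains do not overlap, and $\lambda(\tau, w) = 0$. In particular,

$$|\lambda(pT_r, 0)| = 1 - \frac{|p|}{M}, \qquad -M < p < M. \tag{10-75}$$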

Along the $\tau$-axis ($w = 0$) the ambiguity function consists of repetitions of the
function $|\lambda_0(\tau, w)|$ for the component pulses, with $(M - 1)$ peaks on one side of the
origin, $(M - 1)$ on the other, and one in the center. The heights of these peaks
decrease to each side, as shown by (10-75) and as illustrated in Fig. 6-2. The reason
for this behavior is easily seen. Sets of pulses received separated in time by small
multiples of the period $T_r$ overlap, except for the pulses at the beginning of one train
and at the end of the other. To resolve these composite signals, a receiver must use
those pulses that do not overlap; and the more of them there are (the larger the
index $p$), the more reliable is the resolution of the signal trains.

In the frequency ($w$) direction the width of the peak of the ambiguity function
$|\lambda_0(\tau, w)|$ of the component pulses $E(t)$ is on the order of $(\Delta_0 t)^{-1}$, where $\Delta_0 t$ is the
rms duration of $E(t)$. As we have taken $T_r \gg \Delta_0 t$, $(\Delta_0 t)^{-1} \gg T_r^{-1}$. The factor with
the sines multiplying $|\lambda_0(\tau - pT_r, w)|$ in (10-74) is the comb function we introduced
in (6-121). It breaks up this peak of $|\lambda_0(\tau - pT_r, w)|$ into a succession of narrower
peaks having widths on the order of $2\pi/[(M - |p|)T_r]$ and spaced by $2\pi/T_r$. These
peaks resemble the amplitude pattern of light reflected from a diffraction grating.

We fix our attention on the central set of peaks ($p = 0$) of the function
$|\lambda(\tau, w)|$, for which the multiplying factor is

$$|C(w)| = \left|\frac{\sin \tfrac12 MwT_r}{M \sin \tfrac12 wT_r}\right|.$$

This factor, a portion of which is sketched in Fig. 6-3, reaches a value of 1 whenever
$w = 2k\pi/T_r$, $k$ an integer, producing a peak whose width is on the order of $2\pi/MT_r$.
Between the tall peaks are a number of ripples whose height is lower by a factor
$M^{-1}$. This "diffraction pattern" is superimposed on the original function $|\lambda_0(\tau, w)|$,
breaking it up into many narrow peaks of width $2\pi/MT_r$ and period $2\pi/T_r$, both
of which are much smaller than the width $(\Delta_0 t)^{-1}$ of $\lambda_0(\tau, w)$ in the $w$ direction.
The rms duration of the signal is now on the order of $\Delta t \approx MT_r$, and the area of
the $(\tau, w)$-plane covered by the central peak of the ambiguity function is again on
the order of $(\Delta t\,\Delta\omega)^{-1}$. The entire ambiguity function occupies a total area of about
$\Delta t\,\Delta\omega$.

The breaking up of $|\lambda_0(\tau, w)|$ into peaks and valleys in the $w$ direction indicates
that trains of pulses can be more effectively resolved in frequency than single pulses
of the same total energy. This can be understood by observing that the coherent
repetition $M$ times of the pulse $E(t)$ causes its spectrum $e(\omega)$ to divide into a line
spectrum. The lines are separated by $2\pi/T_r$, and their widths are about $2\pi/MT_r$.
If the Doppler shifts due to the motions of the radar targets are such that these
line spectra for the echoes interlace, filters can be constructed to resolve the signals
with high probability. If the relative velocity of the targets is such that the shift $w$
is an integral multiple of the repetition frequency, $w = 2k\pi/T_r$, on the other hand,
the line spectra will overlap and resolution will be difficult unless the signals are far
enough apart in time.

The measurement of the velocity of an isolated target, by estimating the Doppler
shift of its radar echoes as described in Sec. 6.3, can be made more accurately
if one utilizes a train of coherent pulses in place of a single pulse. The
behavior of the function $|\lambda(\tau, w)|$ indicates, however, that ambiguity will be introduced
into the results, frequencies differing by multiples of $2\pi/T_r$ becoming
indistinguishable. For a repetition period $T_r = 10^{-3}$ sec and a carrier frequency
$\Omega_0 = 2\pi \cdot 3 \cdot 10^{9}$ sec$^{-1}$ (3000 MHz), the ambiguity in target velocity amounts to
$\Delta v = c/[2(\Omega_0/2\pi)T_r] = 180$ km/hr $= 112$ mph; $c$ is the velocity of light.
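
The arithmetic behind this figure can be reproduced in a few lines. This is an illustrative sketch only; the function name `velocity_ambiguity` is hypothetical, and the exact value of $c$ is used rather than the rounded $3 \cdot 10^8$ m/sec that yields exactly 180 km/hr.

```python
C_LIGHT = 299_792_458.0  # velocity of light in m/s

def velocity_ambiguity(carrier_hz, repetition_period_s):
    """Velocity spacing whose two-way Doppler shift equals one repetition frequency."""
    wavelength = C_LIGHT / carrier_hz
    # The two-way Doppler shift is 2 v / wavelength; set it equal to 1 / T_r.
    return wavelength / (2.0 * repetition_period_s)

dv = velocity_ambiguity(3.0e9, 1.0e-3)   # 3000 MHz carrier, T_r = 1 ms
print(f"{dv:.2f} m/s = {dv * 3.6:.1f} km/hr = {dv * 3.6 / 1.609344:.1f} mph")
```

Halving the repetition period doubles the unambiguous velocity interval, at the cost of halving the unambiguous range interval.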



10.3.3.2 The clear area. If it is only the resolution of very close targets that
is of concern, the best signal is one whose ambiguity function $|\lambda(\tau, w)|$ has as slender
a central peak as possible, outside of which the function must take on the lowest
possible values. We now know that a long train of narrow pulses has these properties.
The area $A_c$ occupied by the central peak is inversely proportional to the number
$M$ of pulses in the train,

$$A_c \approx \frac{2\pi}{M T_r\,\Delta_0\omega},$$

where $T_r$ is the pulse repetition period and where $\Delta_0\omega^{-1}$, the reciprocal of the rms
bandwidth of the component pulses, measures the width of their ambiguity function
$|\lambda_0(\tau, w)|$ along the $\tau$-axis. The level of the ambiguity function $|\lambda(\tau, w)|$ between the
central peak and the adjacent peaks is on the order of $M^{-1}$, and by taking $M$ large
enough, both this level and the area $A_c$ can be made as small as desired.

The peaks nearest to the central one are separated from it by $T_r$ along the $\tau$-axis
and by $2\pi/T_r$ along the $w$-axis. The area of the $(\tau, w)$-plane over which $|\lambda(\tau, w)|$
can be made arbitrarily small is, therefore, on the order of $(2T_r)(4\pi/T_r) = 8\pi$. The
question whether this "clear area" can be made any broader by judicious choice of
the signal waveform $F(t)$ has been answered in the negative by Price and Hofstetter
[Pri65], who worked out bounds on the size of clear areas of various shapes.

10.3.3.3 Polyphase-coded pulse trains. With a long uniform pulse train
the $(\tau, w)$-plane is studded with a great many narrow peaks whose heights near the
origin are almost equal to 1, and ambiguities abound. One way to suppress them is
to vary the relative phases of the pulses. The signal in (10-55) is replaced by

$$F(t) = M^{-1/2}\sum_{k=0}^{M-1} a_k E(t - kT_r), \qquad \sum_{k=0}^{M-1} |a_k|^2 = M, \tag{10-76}$$

with the component pulses $E(t)$ normalized as before and assumed not to overlap.
The sequence $a_0, a_1, \ldots, a_{M-1}$ is often called a code. The amplitude factors $a_k$ are
complex; when only the relative phases are being altered, their absolute values $|a_k|$
equal 1. In this way the transmitted power is uniform throughout the sequence,
and the pulse train carries the greatest total energy permitted by voltage constraints
imposed by the necessity of avoiding breakdown in the antenna feedlines. The
sequences $\{a_k\}$ are then called polyphase codes.

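Instead of (10-74) one now finds for the ambiguity function of the pulse train

$$|\lambda(\tau, w)| = \sum_{p=-(M-1)}^{M-1} |\lambda_0(\tau - pT_r, w)|\, C_p(wT_r),$$

$$C_p(x) = \frac{1}{M}\left|\sum_{k=0}^{M-|p|-1} a_k a^*_{k+|p|}\, e^{-ikx}\right|, \qquad -M < p < M, \quad C_0(0) = 1. \tag{10-77}$$

The function $C_p(x)$ is called the discrete ambiguity function [Ger91].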

If no Doppler shifts $w$ greater than a fraction of $2\pi/T_r$ are expected, the
designer needs to be concerned mainly with the behavior of the $M$ quantities

$$\sum_{k=0}^{M-|p|-1} a_k a^*_{k+|p|},$$

which represent the correlation of the sequence $a_k$ with itself, "correlation" being
taken, of course, not in the statistical sense. For the sake of comparison, keep in
mind that for the uniform pulse train with $a_k = 1$

$$C_p(0) = \begin{cases} 1 - \dfrac{|p|}{M}, & |p| < M, \\ 0, & |p| \ge M. \end{cases}$$

The simplest choices for the $a_k$'s are the numbers +1 and -1. In discussing the
synchronization of long trains of pulses in a binary communication system, Barker
[Bar53] recommended the use of sequences of $M$ positive and negative pulses for
whose amplitudes

$$C_0(0) = 1, \qquad C_p(0) \le \frac{1}{M}, \quad p \ne 0,$$

and he exhibited a number of such sequences. One of length 5 having this property
is +1, +1, +1, -1, +1, for which

$$C_0(0) = 1, \quad C_1(0) = 0, \quad C_2(0) = \tfrac15, \quad C_3(0) = 0, \quad C_4(0) = \tfrac15.$$

It is difficult to find long sequences with such an advantageous autocorrelation. In
fact, Turyn and Storer [Tur61] showed that there exist no Barker codes of odd length
$M$ greater than 13. The Barker code of length 13 is

+1, +1, +1, +1, +1, -1, -1, +1, +1, -1, +1, -1, +1. (10-78)

Golomb and Scholtz [Gol65] studied "generalized Barker sequences" in which the
$a_k$'s are the $n$th roots of 1, that is, powers of $\exp(2\pi i/n)$ for integers $n$. They tabulated
a number of these for various values of $n$ and $M$, and for small values of $M$ they
stated for which integers $n$ generalized Barker sequences exist.
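
The correlations quoted for the two Barker codes are easy to verify. The sketch below is an illustration rather than anything from the original text; the function name `aperiodic_correlation` is hypothetical, and it evaluates $C_p(0)$ of (10-77) for $p = 0, \ldots, M - 1$.

```python
import numpy as np

def aperiodic_correlation(code):
    """Return C_p(0) = |sum_k a_k conj(a_{k+p})| / M for p = 0, ..., M-1."""
    a = np.asarray(code, dtype=complex)
    M = len(a)
    return np.array([abs(np.sum(a[:M - p] * np.conj(a[p:]))) / M for p in range(M)])

barker5 = [+1, +1, +1, -1, +1]
barker13 = [+1, +1, +1, +1, +1, -1, -1, +1, +1, -1, +1, -1, +1]   # (10-78)

c5 = aperiodic_correlation(barker5)
c13 = aperiodic_correlation(barker13)
print(np.round(c5 * 5).astype(int))     # correlations in units of 1/M
print(np.round(c13 * 13).astype(int))
```

The same function accepts complex polyphase codes, since the conjugate is taken explicitly.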

Another approach to the design of effective coherent pulse trains considers
periodic repetitions of polyphase codes. The periodic correlation of a periodically
repeated sequence $a_0, a_1, \ldots, a_{M-1}$, $a_{k+M} = a_k$, is defined as

$$s_k = \sum_{j=0}^{M-1} a_j a^*_{j+k} = \sum_{j=0}^{M-1-k} a_j a^*_{j+k} + \sum_{j=M-k}^{M-1} a_j a^*_{j+k-M}, \qquad 0 < k < M. \tag{10-79}$$

Techniques have been developed for determining sequences $\{a_k\}$ such that $s_k = 0$,
$1 \le k < M$ (the "perfect periodic codes"), or such that $|s_k|$ is on the order of $\sqrt{M}$ for
$M \gg 1$, $1 \le k < M$ (the "asymptotically perfect periodic codes") [Hei61], [Fra62],
[Chu72], [Fra80], [Lew82], [Ger91]. Experience has shown that the correlations $C_p(0)$
determining the resolution of signals with zero relative frequency shift ($w = 0$) are
then generally small for $1 \le p < M$, so that the resulting signals (10-76) will be
effective for radar detection [Fra63]. The practical aspects of utilizing such codes in
radar are considered in [Lew81].
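
As an illustration of (10-79), the sketch below evaluates the periodic correlation of the 16-element Frank code [Fra62], built as in Problem 10-12 by writing out the rows of the $N \times N$ matrix with elements $\exp[2\pi i(k-1)(m-1)/N]$. The function names are hypothetical, and the choice $N = 4$ is only an example.

```python
import numpy as np

def frank_code(N):
    """Frank code of length N*N: rows of exp[2*pi*i*(k-1)*(m-1)/N], k, m = 1..N."""
    k = np.arange(N).reshape(-1, 1)   # k - 1
    m = np.arange(N).reshape(1, -1)   # m - 1
    return np.exp(2j * np.pi * k * m / N).reshape(-1)   # rows written one after another

def periodic_correlation(a):
    """s_k = sum_j a_j conj(a_{(j+k) mod M}) for k = 0, ..., M-1, as in (10-79)."""
    a = np.asarray(a, dtype=complex)
    return np.array([np.sum(a * np.conj(np.roll(a, -k))) for k in range(len(a))])

s = periodic_correlation(frank_code(4))
print(np.round(np.abs(s), 10))
```

For this code every $s_k$ with $1 \le k < M$ vanishes to rounding error, which is the defining property of a perfect periodic code.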

If Doppler shifts $w$ much larger than $2\pi/T_r$ may occur, one must investigate
the functions $C_p(x)$ for all values of $x$ in the interval $(0, 2\pi)$, which is their basic
period. The pattern $C_p(0)$ of peak heights along the $\tau$-axis will be repeated along all
lines $w = 2\pi m/T_r$ parallel to the $\tau$-axis for positive and negative integers $m$. Even
with Barker sequences, $C_p(x)$ rises much above the level $M^{-1}$ for values of $x$ between
0 and $2\pi$. Indeed, when $|a_k| = 1$,

$$\sum_{p=-(M-1)}^{M-1} \int_0^{2\pi} [C_p(x)]^2\, \frac{dx}{2\pi} = 1,$$

which indicates that $C_p(x)$ can be expected to reach heights on the order of $M^{-1/2}$
on the average. In Fig. 10-7 we exhibit the function $C_p(x)$ for the Barker sequence
with $M = 13$ for a few values of $p$. Although small for $x = 0$ and $p > 0$ and along
the $x$-axis away from the origin for $p = 0$, it takes on large values elsewhere. For an
illustration of the resulting ambiguity function, see [Rih69, p. 217]. Diagrams of the
discrete ambiguity function for polyphase codes with large numbers $M$ of elements
are to be found in [Kre83], [Lew86], and [Ger91]; they exhibit similar behavior.
Signals with frequency shifts $w$ such that $x = wT_r$ falls into the region where $C_p(x)$
is large will be difficult to detect in the presence of signals with zero frequency shift.

Figure 10-7. Function $C_p(x)$ in (10-77) for the Barker sequence with $M = 13$
for $p$ = 0, 1, 3, 6, 9. Curves are indexed with the value of $p$. Only the segment
$0 < x < \pi$ is shown. The function is symmetrical about $x = \pi$ and periodic with
period $2\pi$.
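
The behavior of the discrete ambiguity function can be explored numerically. The sketch below is illustrative only: the function name `discrete_ambiguity` is hypothetical, and the sign of the exponent in $C_p(x)$ is an assumption that does not affect the values for a real code such as the Barker sequence. It evaluates $C_p(x)$ for the 13-element Barker code on a uniform grid over one period and sums the mean squares over $-M < p < M$, using $C_{-p} = C_p$.

```python
import numpy as np

def discrete_ambiguity(code, x):
    """C_p(x) = |sum_k a_k conj(a_{k+p}) exp(-i k x)| / M; row p, for p = 0..M-1."""
    a = np.asarray(code, dtype=complex)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    M = len(a)
    C = np.empty((M, len(x)))
    for p in range(M):
        k = np.arange(M - p)
        products = a[:M - p] * np.conj(a[p:])
        C[p] = np.abs(np.exp(-1j * np.outer(x, k)) @ products) / M
    return C

barker13 = [+1, +1, +1, +1, +1, -1, -1, +1, +1, -1, +1, -1, +1]   # (10-78)
x = 2.0 * np.pi * np.arange(64) / 64          # uniform grid over one period
C = discrete_ambiguity(barker13, x)

mean_squares = np.mean(C**2, axis=1)          # average of C_p(x)^2 over x, per p
total = mean_squares[0] + 2.0 * np.sum(mean_squares[1:])
print(total, C[1:, 0].max(), C[1:].max())
```

Because each $[C_p(x)]^2$ is a trigonometric polynomial of degree below 64, the grid average here equals the integral over the period, so `total` reproduces the sum rule to rounding error.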

Other sequences of amplitudes $a_k$ investigated include trains of +1's and -1's
that can be generated by a binary shift register, particularly sequences of maximal
length $2^n - 1$, where $n$ is the number of stages in the shift register [Zie59]. A bibliography
of early studies of this problem was drawn up by Lerner [Ler63]. In general,
there is no way of finding a sequence $\{a_k\}$ whose functions $C_p(x)$ will be small everywhere
in $-\pi < x < \pi$, $0 \le p < M$, except at the origin ($p = 0$, $x = 0$). The most
one can usually do is to try a set of $a_k$'s, compute $C_p(x)$ for $p = 0, 1, \ldots, M - 1$ at
a number of values of $x$ in $(0, 2\pi)$, and see what it looks like.
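
A maximal-length sequence of the kind mentioned above can be generated in a few lines. This is a sketch under stated assumptions: a four-stage register obeying the recurrence $s_n = s_{n-3} \oplus s_{n-4}$ (one standard choice with period $2^4 - 1 = 15$), the initial state 1, 0, 0, 0, and a hypothetical helper name `lfsr_sequence`. None of these specifics come from the text.

```python
import numpy as np

def lfsr_sequence(taps, state, length):
    """Binary sequence with s[n] = XOR of s[n - t] over t in taps, started from `state`."""
    s = list(state)
    while len(s) < length:
        bit = 0
        for t in taps:
            bit ^= s[-t]
        s.append(bit)
    return s[:length]

n_stages = 4
period = 2**n_stages - 1
bits = lfsr_sequence(taps=(3, 4), state=[1, 0, 0, 0], length=2 * period)

a = 1 - 2 * np.array(bits[:period])   # map bit 0 -> +1, bit 1 -> -1
# Periodic correlation (10-79) of the +1/-1 train (real, so no conjugate is needed)
s = np.array([np.sum(a * np.roll(a, -k)) for k in range(period)])
print(bits[:period])
print(s)
```

The periodic correlation of such a train equals $M$ at zero shift and $-1$ at every other shift; this says nothing about $C_p(x)$ away from $x = 0$, which still has to be computed and inspected.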

The artifice of staggering both the epochs and the frequencies of the component
pulses of the train was treated by Rihaczek [Rih64]; see also [Rih69, pp. 308-12].
The usual result is that the clear area becomes filled, and the ambiguity function
outside the central peak takes on a jagged structure with an average level on the
order of $(\Delta t\,\Delta\omega)^{-1/2}$. Overall resolution deteriorates. By reducing the heights of the
outstanding peaks away from the origin, however, the risk of ambiguities (large
errors in the measurement of carrier frequency and arrival time) is diminished.



Problems 



10-1. In the context of Sec. 10.1.3, assume that the arrival time of the signals is known,
but that they may have any of $M$ carrier frequencies $\Omega + k\,\Delta\omega$, $0 \le k < M$, uniformly
separated by $\Delta\omega$, which is on the order of the reciprocal $(\Delta t)^{-1}$ of the rms duration
of the signal. Then the complex envelope of the $k$th signal is

$$F_k(t) = F(t) \exp(ik\,\Delta\omega\, t),$$

in which $F(t)$ is given. As in Sec. 10.1.3, the signals may arrive with arbitrary amplitudes
and with phases uniformly distributed over $(0, 2\pi)$. Some or all of them may be
absent.

Determine the elements of the ambiguity matrix A, and assuming that M is 
so large that the inverse A~' can be taken also to have the Toeplitz form, determine 
the maximum-likelihood receiver of these signals and evaluate the false-alarm and 
detection probabilities for the test of hypothesis Hk, "The kth signal is present," 
versus hypothesis Hi, "The kth signal is absent." Explain how the reliability $(Q_0, Q_d)$
of this test depends on the relation of the frequency interval $\Delta\omega$ to the duration of the
signal. Assume that the observation interval (0, T) is long enough to encompass the 
entire signal and that as usual the noise is white and Gaussian. 
10-2. For a radar transmitted pulse of the form



calculate the signal-to-noise ratio $d^2$ and the bandwidth $\beta$ for detection in clutter for
$w = 0$; use the formulas in Sec. 10.2.2.

10-3. What is the dependence of the bandwidth $\beta$ in (10-51) on the transmitted power $P_t$
when $P_t \gg N\Delta\omega$?

10-4. Calculate the ambiguity function $\lambda(\tau, w)$ for the signal



10-5. Show that for two identical signals of equal energies arriving in white Gaussian noise
at known times separated by $\tau$, the signal energy required to decide with given error
probabilities which signal is present is proportional to $(\Delta\omega^2\tau^2)^{-1}$ for small values of $\tau$.

10-6. Verify (10-53).

10-7. Prove Siebert's self-transform property, (10-63).

10-8. Derive (10-72) from the self-transform property and (10-67).

10-9. Derive and sketch the absolute value of the Fourier transform $f(\omega)$ of the chirp signal
(10-70) for $bT^2 \ll 1$ and for $bT^2 \gg 1$. Verify (10-71). Why is (10-67) invalid for this
signal? Hint: Use the Cornu spiral [Jah45, p. 37].
10-10. Define the cross-ambiguity function $\lambda_{12}(\tau, w)$ for signals with complex envelopes $F_1(t)$
and $F_2(t)$ by



Fin o;r 



') = F(t)exp &Aw/, 



F(t)*yfce-»'U(t). 




Prove the relation 




= X, 3 (* - S, y + it,)KUx + 8, y - ft) 
[Tit66]. Show how to obtain (10-63) from this. What does the above relation become
for $F_1(t) = F_2(t) = F_3(t) = F_4(t) = F(t)$ when $x = y = 0$?

10-11. For the 13-element Barker sequence in (10-78) calculate the correlations $C_p(0)$ and
$C_p(\pi)$ by (10-77).

10-12. The Frank code has $N^2$ elements for any positive integer $N$ [Fra62]. One forms an
$N \times N$ matrix of which the $km$-element is $\exp[2\pi i(k - 1)(m - 1)/N]$. The Frank code
is composed by writing down the $N$ rows of this matrix, one after another. Write out
the code for $N = 4$ and calculate the autocorrelation $C_p(0)$, $-16 < p < 16$, for it. If
you have a computer with a fast Fourier transform algorithm in it, evaluate and plot
the discrete ambiguity function $C_p(x)$ of (10-77) for a number of values of $p$ between
0 and 15.

11.1 STRUCTURE OF THE RECEIVER

11.1.1 Types of Stochastic Signals

Thus far we have presumed that the receiver knows the form of the signals to be 
detected and may be ignorant only of certain parameters such as amplitude, phase, 
and time of arrival. It is sometimes impossible, however, to specify the detailed 
structure of the signal, which may differ from one instance to another. The designer 
may then have to imagine the signals to have been drawn from an ensemble of 
random processes with certain statistical properties. Such signals are known as 
stochastic signals. Although their waveforms are usually complicated, it is not their 
complexity, but the unpredictability of their precise configurations that places them 
in this category. 

Stochastic signals may have been generated in a random manner or, originally
possessing a definite form, may have been erratically distorted en route to the receiver.
A system for transmitting binary digits, for instance, might send a burst of
random noise of fixed duration to represent each 1, with blank intervals standing
for the 0's. The signals might have the form

$$s(t) = \mathrm{Re}\, M(t) Z(t) e^{i\Omega t}, \tag{11-1}$$

where $M(t)$ is a fixed modulation, $\Omega$ the carrier frequency, and $Z(t)$ the complex
envelope of a stationary random process of known complex autocovariance function
$\phi(\tau)$:

$$\tfrac12 E[Z(t_1) Z^*(t_2)] = \phi(t_1 - t_2), \qquad E[Z(t_1) Z(t_2)] \equiv 0.$$

(Henceforth we omit the tildes that in Chapter 3 distinguished complex autocovariance
functions and narrowband spectral densities.) The complex autocovariance
function of the signal envelope $S(t)$ is then

$$\phi_s(t_1, t_2) = \tfrac12 E[S(t_1) S^*(t_2)] = M(t_1) M^*(t_2)\, \phi(t_1 - t_2). \tag{11-2}$$

The jamming signals transmitted to incommode an enemy radar are sometimes of 
this nature; they can be generated by amplifying the output of noisy gas-discharge 
tubes. The signals that radio telescopes pick up from distant parts of the universe 
are stochastic and usually stationary for relatively long periods of time. 

Scatter-multipath communication systems link stations far beyond each other's
horizons by emitting signals in such a direction that they will be reflected from the
ionosphere. From each determinate transmitted pulse there arrive a large number
of weak signals that have traveled paths of slightly different lengths, along which
they have suffered a variety of attenuations and distortions. The sum of all these
signals strongly resembles a stochastic process [Pri56], [Pri58], [Bel63]. In radar
astronomy the signals are reflected from a planet or satellite at a large number of
scattering points, and the combination of all the echoes again creates a stochastic
signal [Pri60].

When each transmitted pulse $\mathrm{Re}\, F(t) \exp i\Omega t$ is reflected without distortion
from a multitude of moving scatterers that introduce Doppler shifts $w_m$ and are so
located that the total delays between transmitter and receiver are $\tau_k$, the received
signal is $\mathrm{Re}\, S(t) \exp i\Omega t$, and its complex envelope is

$$S(t) = \sum_{k,m} z_{km} F(t - \tau_k) \exp i w_m t.$$

Here $z_{km}$ is a complex number representing the amplitude and phase of the signal
with delay $\tau_k$ and shift $w_m$. The complex autocovariance function of the received
signal is

$$\phi_s(t_1, t_2) = \tfrac12 \sum_{k,m} \sum_{k',m'} E(z_{km} z^*_{k'm'})\, F(t_1 - \tau_k) F^*(t_2 - \tau_{k'}) \exp(i w_m t_1 - i w_{m'} t_2),$$

where $E$ denotes an expected value with respect to the ensemble of scatterings. If
separate scatterings are assumed statistically independent,

$$\phi_s(t_1, t_2) = \tfrac12 \sum_{k,m} E(|z_{km}|^2)\, F(t_1 - \tau_k) F^*(t_2 - \tau_k) \exp i w_m (t_1 - t_2).$$

When the scatterers are small and dense, this sum can be written as an integral by
introducing a function $\sigma(\tau, w)$ defined by

$$\sigma(\tau, w)\, d\tau\, \frac{dw}{2\pi} = \tfrac12 \sum E(|z_{km}|^2),$$

in which the summation is taken over those scatterers resulting in a delay between $\tau$
and $\tau + d\tau$ and a frequency shift between $w/2\pi$ and $(w + dw)/2\pi$. Then

$$\phi_s(t_1, t_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \sigma(\tau, w)\, F(t_1 - \tau) F^*(t_2 - \tau) \exp i w (t_1 - t_2)\, d\tau\, \frac{dw}{2\pi}. \tag{11-3}$$

An example of this type of signal is the clutter noise we discussed in Sec. 10.2.
If we put

$$\psi(\tau, s) = \int_{-\infty}^{\infty} \sigma(\tau, w)\, e^{iws}\, \frac{dw}{2\pi},$$

we can write the complex autocovariance function of the received signal as

$$\phi_s(t_1, t_2) = \int_{-\infty}^{\infty} \psi(\tau, t_1 - t_2)\, F(t_1 - \tau) F^*(t_2 - \tau)\, d\tau. \tag{11-4}$$

Autocovariance functions of this general form were assigned by Price and Green
[Pri60] to the echoes expected in radar astronomy. If $\psi(\tau, s)$ as a function of $\tau$ is
significant over a range of values of $\tau$ much longer than the pulse $F(t)$, the target is
said to be deep fluctuating; if $\psi(\tau, s) = \psi(s)\,\delta(\tau - \tau_0)$, it is termed a fluctuating point
target, which is in effect much thinner than the incident signal.

The autocovariance functions in (11-2), (11-3), and (11-4) exemplify those characterizing
different kinds of stochastic signals. We shall assume furthermore that the
signals are realizations of Gaussian processes of expected value 0, taking the signals
and noise to be quasiharmonic and the processes in question to be of the circular
Gaussian type described in Sec. 3.2.3. The joint probability density function of any
set of samples of their complex envelopes taken at arbitrary times has a circular
Gaussian form like that in (3-40).

The stochastic signals, when present, are received in the midst of Gaussian
noise, to which we attribute a complex autocovariance function $\phi_0(t_1, t_2)$. For stationary
noise $\phi_0(t_1, t_2)$ is a function only of $t_1 - t_2$. If the noise is white with unilateral
spectral density $N$, by (3-45)

$$\phi_0(t_1, t_2) = N\,\delta(t_1 - t_2). \tag{11-5}$$

Stochastic signals are sometimes picked up not by a single antenna, but by 
a number of antennas or sensors located at different points of space. Many seis- 
mometers may be distributed over a broad area for the detection of seismic waves 
such as might come from an earthquake or nuclear explosion, and arrays of ultra- 
sonic sensors have been constructed for receiving acoustic signals under water, as in 
sonar. Both seismic and sonar signals can be represented as stochastic processes, 
and techniques for processing the outputs of such arrays can be derived from the 
principles of detection theory. Instead of a single input v(t), there are now a number 
of inputs, the signal and noise components of which are correlated both temporally 
and spatially. The methods to be described here can be extended to handle multiple 
inputs, but with some increase of mathematical complexity. For the application to 
seismology, we refer the reader to the December 1965 issue of the Proceedings of the 
IEEE; for the application to sonar we cite the paper by Middleton and Groginsky
[Mid65]. Further references are to be found in both.

The detection of stochastic signals seems to have been treated first by Davis
[Dav54] and Youla [You54]. The approach through the theory of hypothesis testing
was taken by Middleton [Mid57]. The task of the receiver is viewed as one
of choosing between two hypotheses about its input $v(t) = \mathrm{Re}\, V(t) \exp i\Omega t$. Under
hypothesis $H_0$, $V(t) = N(t)$, where $N(t)$ is the complex envelope of Gaussian narrowband
noise of complex autocovariance function $\phi_0(t_1, t_2)$. Under hypothesis $H_1$,
$V(t) = S(t) + N(t)$, where $S(t)$ is a realization of a narrowband Gaussian process of
complex autocovariance function $\phi_s(t_1, t_2)$. The signals and noise being independent,
the complex autocovariance function of the input $v(t) = \mathrm{Re}\, V(t) \exp i\Omega t$ is

$$\phi_1(t_1, t_2) = \phi_s(t_1, t_2) + \phi_0(t_1, t_2)$$

under hypothesis $H_1$. The input $v(t)$ is observed during an interval $(0, T)$.

11.1.2 Vector-space Representation 

As we learned in Chapter 1, the best strategy for the receiver is to form the likelihood 
ratio between the joint probability density functions of samples of the input under 
the two hypotheses. The likelihood ratio is compared with a decision level that 
depends on the criterion of choice, Bayes or Neyman-Pearson, that the designer 
has adopted. Before determining the likelihood ratio, however, we shall introduce a 
convenient notation for handling the signals, the noise, their complex autocovariance 
functions, and similar functions of one or two time variables that arise in this study. 
We shall set up a vector-space representation for our input and its constituent signals 
and noise much like that introduced in Sec. 2.1. 

The complex envelope $V(t)$ of the input will be sampled by means of a complete
set of functions $f_k(t)$ that are orthonormal over the observation interval $(0, T)$:

$$\int_0^T f_k(t) f_m^*(t)\, dt = \delta_{km} = \begin{cases} 1, & k = m, \\ 0, & k \ne m. \end{cases} \tag{11-6}$$

We can then write the input as

$$V(t) = \sum_k V_k f_k(t), \qquad 0 \le t \le T, \tag{11-7}$$

whose complex coefficients $V_k$ are defined by

$$V_k = \int_0^T f_k^*(t) V(t)\, dt. \tag{11-8}$$

[Sums as in (11-7) without indicated limits will be taken to run from $k = 1$ to
$k = \infty$.] In this way we set up a correspondence between the temporal function
$V(t)$, $0 \le t \le T$, and the vector $\mathbf{V} = (V_1, V_2, \ldots, V_k, \ldots)$ of coefficients. In future
operations with matrices this vector should be considered as a column vector. Its
transposed conjugate row vector

$$\mathbf{V}^+ = (V_1^*, V_2^*, \ldots, V_k^*, \ldots)$$

corresponds to the complex conjugate function $V^*(t)$.

The scalar product of two functions $V(t)$ and $W(t)$, represented respectively
by vectors $\mathbf{V}$ and $\mathbf{W}$, is defined as usual by

$$\mathbf{V}^+\mathbf{W} = \sum_k V_k^* W_k = \int_0^T V^*(t) W(t)\, dt, \tag{11-9}$$

the second equality following from the orthonormality relation (11-6).

In like manner we associate with a function $m(t, s)$ the matrix $\mathbf{M} = \|m_{jk}\|$ of
coefficients defined by

$$m_{jk} = \int_0^T\!\int_0^T f_j^*(t)\, m(t, s)\, f_k(s)\, dt\, ds,$$

and in terms of these matrix elements $m_{jk}$ the function is

$$m(t, s) = \sum_j \sum_k m_{jk} f_j(t) f_k^*(s). \tag{11-10}$$

When the function $m(t, s)$ has the Hermitian property

$$m(t, s) = m^*(s, t), \tag{11-11}$$

the matrix $\mathbf{M}$ is Hermitian:

$$\mathbf{M} = \mathbf{M}^+, \qquad m_{jk} = m_{kj}^*.$$

To a linear operation of the form

$$W(t) = \int_0^T m(t, s) V(s)\, ds \tag{11-12}$$

corresponds the linear transformation

$$\mathbf{W} = \mathbf{M}\mathbf{V} \tag{11-13}$$

of the vector $\mathbf{V}$ into the vector $\mathbf{W}$. Similarly, to the matrix product $\mathbf{LM}$ corresponds
the function

$$\int_0^T l(t, u)\, m(u, s)\, du$$

of $t$ and $s$, where $l(t, s)$ corresponds to the matrix $\mathbf{L}$ and $m(t, s)$ to the matrix $\mathbf{M}$, as
in (11-10). All relations of this kind can be demonstrated by (11-6), and we suggest
that the reader do so.

To the matrix $\mathbf{M}^2$ corresponds the function

$$m^{(2)}(t, s) = \int_0^T m(t, u)\, m(u, s)\, du, \tag{11-14}$$

and higher powers $\mathbf{M}^j$ and the corresponding "iterates" $m^{(j)}(t, s)$ can be defined by
continuation of this process. Do not confuse $m^{(j)}(t, s)$ with the $j$th power of the
function $m(t, s)$.

If the functions $f_k(t)$ are eigenfunctions of the operator $m(t, s)$,

$$\mu_k f_k(t) = \int_0^T m(t, s) f_k(s)\, ds, \qquad 0 \le t \le T, \tag{11-15}$$

the $\mu_k$ are the eigenvalues of $m(t, s)$. The matrix $\mathbf{M}$ is then diagonal, and the eigenvalues
$\mu_k$ are its diagonal elements. When $m(t, s)$ is Hermitian as in (11-11), the
eigenvalues $\mu_k$ are real and the functions $f_k(t)$ orthonormal, as shown in Sec. 2.1.6.
If furthermore $m(t, s)$ is positive definite, the eigenvalues $\mu_k$ are positive, and we
assume them to have been arranged in descending order:

$$\mu_1 \ge \mu_2 \ge \cdots \ge \mu_k \ge \cdots > 0.$$

Mercer's theorem states that the kernel $m(t, s)$ of (11-15) can be expressed in
terms of its eigenfunctions as

$$m(t, s) = \sum_k \mu_k f_k(t) f_k^*(s), \tag{11-16}$$

and it is a consequence of (11-15) and (11-10); compare (2-42). The iterates of the
kernel $m(t, s)$ are similarly

$$m^{(j)}(t, s) = \sum_k \mu_k^j f_k(t) f_k^*(s). \tag{11-17}$$

For any function $g(x)$ that possesses a power-series expansion

$$g(x) = \sum_{j=0}^{\infty} a_j x^j$$

converging in the neighborhood of the origin, we can define the matrix function

$$g(\mathbf{M}) = \sum_{j=0}^{\infty} a_j \mathbf{M}^j, \tag{11-18}$$

and to it will correspond a function of the time variables $t$ and $s$:

$$g_m(t, s) = \sum_{j=0}^{\infty} a_j m^{(j)}(t, s), \qquad 0 \le (t, s) \le T.$$

The trace of the matrix $\mathbf{M}$, which is the sum of its diagonal elements or its
eigenvalues, is given by

$$\mathrm{Tr}\, \mathbf{M} = \sum_k \mu_k = \int_0^T m(t, t)\, dt, \tag{11-19}$$

which we obtain by setting $s = t$ in (11-16) and integrating over the interval $(0, T)$.
Alternatively, set $t = s$ in (11-10), integrate over $(0, T)$, and use (11-6).

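
The correspondence between kernels and matrices can be tried out on a grid. The sketch below is only an illustration under stated assumptions: the kernel $m(t, s) = e^{-|t - s|}$, the interval length $T = 2$, and the 200-point midpoint grid are all arbitrary choices, not taken from the text. Integrals over $(0, T)$ become sums weighted by the step $h$, so the operator is represented by the matrix of kernel samples times $h$, and its eigenvectors divided by $\sqrt{h}$ approximate orthonormal eigenfunctions.

```python
import numpy as np

T, n = 2.0, 200
h = T / n
t = (np.arange(n) + 0.5) * h                        # midpoint grid on (0, T)
kernel = np.exp(-np.abs(t[:, None] - t[None, :]))   # assumed example kernel m(t, s)

# Discretized operator: the integral over s becomes a sum weighted by h.
M = kernel * h
mu, vecs = np.linalg.eigh(M)                        # real eigenvalues, ascending
mu, vecs = mu[::-1], vecs[:, ::-1]                  # descending order, as in the text
f = vecs / np.sqrt(h)                               # columns approximate f_k(t)

mercer = (f * mu) @ f.conj().T                      # sum_k mu_k f_k(t) f_k*(s), cf. (11-16)
trace_from_eigs = mu.sum()                          # sum of eigenvalues
trace_from_kernel = np.sum(np.diag(kernel)) * h     # integral of m(t, t), cf. (11-19)
iterate2 = kernel @ kernel * h                      # m^(2)(t, s) of (11-14)
iterate2_mercer = (f * mu**2) @ f.conj().T          # (11-17) with j = 2
print(trace_from_eigs, trace_from_kernel, mu[:3])
```

On the grid these relations hold to rounding error; how closely the discrete eigenvalues approach those of the continuous kernel depends on the step $h$.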
The function 

D{z) = ["JO + 
k = \ 

is called the Fredholm determinant associated with the kernel m(t,s). it will figure 
prominently in our calculations of the false-alarm and detection probabilities for 
stochastic signals in white noise. We can write it as 

D{z)~ det(I + zM), 

where det stands for the determinant of what is here an infinite matrix, and I is 
the identity matrix. The determinant of a finite matrix equals the product of its 
eigenvalues. Taking logarithms turns the product into a sum, and extending this to 
our infinite matrices we define, for positive-definite m(t, s) and its representative M, 

In det(I + zM) = Trln(I + zM). (11-20) 

The eigenvalues of $\mathbf{I} + z\mathbf{M}$ are $1 + z\mu_k$, and provided that the complex number $z$ lies within a circle of radius $\mu_1^{-1}$ about the origin ($\mu_1$ is the largest eigenvalue of the kernel $m(t, s)$), the matrix function $\ln(\mathbf{I} + z\mathbf{M})$ is definable by (11-18) because the function $\ln(1 + zx)$ possesses a convergent power series in the neighborhood of the origin. We can express (11-20) as

$$\ln \det(\mathbf{I} + z\mathbf{M}) = \operatorname{Tr} \int_0^z (\mathbf{I} + u\mathbf{M})^{-1}\mathbf{M}\, du = \operatorname{Tr} \int_0^z \mathbf{P}(u)\, du \tag{11-21}$$

in terms of a matrix function

$$\mathbf{P}(u) = (\mathbf{I} + u\mathbf{M})^{-1}\mathbf{M}. \tag{11-22}$$

The integration in (11-21) can be thought of as carried out by integrating the series 
expansion of the integrand term by term. 
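As a finite-dimensional illustration of (11-20) through (11-22), the sketch below compares the product form of $D(z)$, the determinant form, the trace of the logarithm, and the integral of $\operatorname{Tr}\mathbf{P}(u)$. It assumes a discretized exponential kernel and a Gauss-Legendre rule for the integral over $u$; these choices and the names used are illustrative only.

```python
import numpy as np

n, T, rate = 200, 1.0, 3.0
dt = T / n
t = (np.arange(n) + 0.5) * dt
# Matrix standing in for the integral operator with kernel exp(-rate |t - s|)
M = np.exp(-rate * np.abs(t[:, None] - t[None, :])) * dt
eigs = np.linalg.eigvalsh(M)
I = np.eye(n)
z = 0.8

D_product = np.prod(1.0 + z * eigs)        # product of (1 + z mu_k)
D_det = np.linalg.det(I + z * M)           # det(I + zM)
lnD_trace = np.sum(np.log1p(z * eigs))     # Tr ln(I + zM), eq. (11-20)

def trace_P(u):
    """Tr P(u) with P(u) = (I + uM)^{-1} M, eq. (11-22)."""
    return np.trace(np.linalg.solve(I + u * M, M))

# Eq. (11-21): ln D(z) as the integral of Tr P(u) over 0 < u < z
nodes, weights = np.polynomial.legendre.leggauss(20)
u_nodes = 0.5 * z * (nodes + 1.0)
lnD_integral = 0.5 * z * sum(w * trace_P(u) for w, u in zip(weights, u_nodes))
```

All four routes should agree to within rounding and quadrature error.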

The matrix function $\mathbf{P}(u)$ is the solution of the linear equation

$$\mathbf{P}(u) + u\mathbf{M}\mathbf{P}(u) = \mathbf{M}, \tag{11-23}$$

to which corresponds the integral equation

$$P(t, s; u) + u \int_0^T m(t, r) P(r, s; u)\, dr = m(t, s), \qquad 0 < (t, s) < T, \tag{11-24}$$

for the function $P(t, s; u)$ corresponding to the matrix $\mathbf{P}(u)$. Here $u$ is a parameter, possibly complex. By analytic continuation, we can extend the domains of $\mathbf{P}(u)$ and $P(t, s; u)$ over the entire complex $u$-plane. In terms of the eigenfunctions $f_k(t)$ of the kernel $m(t, s)$,

$$P(t, s; u) = \sum_k \frac{\mu_k}{1 + u\mu_k} f_k(t) f_k^*(s), \tag{11-25}$$

and this function has poles at $u = u_k = -1/\mu_k$ along the negative real axis.

By means of this function $P(t, s; u)$ we can use (11-19) to write (11-21) as

$$\ln D(z) = \ln \det(\mathbf{I} + z\mathbf{M}) = \int_0^z \int_0^T P(t, t; u)\, dt\, du,$$

and the Fredholm determinant becomes

$$D(z) = \exp\left[\int_0^z \int_0^T P(t, t; u)\, dt\, du\right]. \tag{11-26}$$

By means of the rules just set forth we shall be able in our subsequent analysis to move freely back and forth between the time domain of functions such as $V(t)$ and $m(t, s)$ and the vector space in which functions $V(t)$ are represented by column vectors $\mathbf{V}$, which are transformed by matrices $\mathbf{M}$ related to $m(t, s)$. We assume throughout that these transitions are legitimate for the kinds of signals and noise we are dealing with, leaving the treatment of exceptional situations to mathematical works on linear operators and functional analysis.

11.1.3 The Likelihood Ratio

The receiver is to decide between two hypotheses

$$V(t) = N(t), \qquad (H_0)$$
$$V(t) = S(t) + N(t), \qquad (H_1)$$

about the complex envelope $V(t)$ of its input $\operatorname{Re} V(t) \exp i\Omega t$. Here $N(t)$ and $S(t)$ are circular complex Gaussian random processes with expected values zero and complex autocovariance functions

$$\phi_0(t, s) = \tfrac{1}{2} E[N(t) N^*(s)], \qquad \phi_s(t, s) = \tfrac{1}{2} E[S(t) S^*(s)].$$

Under hypothesis $H_1$ the complex autocovariance function of the input is

$$\phi_1(t, s) = \phi_s(t, s) + \phi_0(t, s).$$

The input is observed during an interval $(0, T)$. It is sampled as described in Sec. 11.1.2 in terms of a set of functions orthonormal over $(0, T)$ to produce a vector $\mathbf{V} = (V_1, V_2, \ldots)$ of complex samples defined as in (11-8). To the autocovariance functions $\phi_0(t, s)$, $\phi_s(t, s)$, and $\phi_1(t, s)$ now correspond infinite Hermitian matrices $\boldsymbol{\phi}_0$, $\boldsymbol{\phi}_s$, and $\boldsymbol{\phi}_1$, respectively.

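Denote by $\mathbf{V}^{(n)}$ the column vector of the first $n$ of the samples, $\mathbf{V}^{(n)} = (V_1, V_2, \ldots, V_n)^T$. The joint probability density function of the real and imaginary parts of these samples has under each hypothesis the circular complex Gaussian form

$$p_j(\mathbf{V}^{(n)}) = (2\pi)^{-n} [\det \boldsymbol{\phi}_j^{(n)}]^{-1} \exp\left(-\tfrac{1}{2} \mathbf{V}^{(n)+} \boldsymbol{\phi}_j^{(n)-1} \mathbf{V}^{(n)}\right), \qquad j = 0, 1, \tag{11-28}$$

where $\boldsymbol{\phi}_j^{(n)}$ is the $n \times n$ autocovariance matrix of the $n$ samples $V_k$ under hypothesis $H_j$:

$$\boldsymbol{\phi}_j^{(n)} = \|\phi_{j,ik}\|, \qquad \phi_{j,ik} = \tfrac{1}{2} E(V_i V_k^* \mid H_j), \qquad 1 \le (i, k) \le n. \tag{11-29}$$

The optimum strategy for deciding between hypotheses $H_0$ and $H_1$ on the basis of $n$ samples $(V_1, V_2, \ldots, V_n)$ compares the likelihood ratio

$$\Lambda(\mathbf{V}^{(n)}) = \frac{p_1(\mathbf{V}^{(n)})}{p_0(\mathbf{V}^{(n)})} = [\det(\boldsymbol{\phi}_1^{(n)} \boldsymbol{\phi}_0^{(n)-1})]^{-1} \exp\left[\tfrac{1}{2} \mathbf{V}^{(n)+} (\boldsymbol{\phi}_0^{(n)-1} - \boldsymbol{\phi}_1^{(n)-1}) \mathbf{V}^{(n)}\right] \tag{11-30}$$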

with a suitable decision level. Passing to the limit $n \to \infty$, we find that a sufficient statistic utilizing all the information in the input $v(t) = \operatorname{Re} V(t) \exp i\Omega t$ is the logarithmic likelihood ratio

$$\ln \Lambda(\mathbf{V}) = \tfrac{1}{2} \mathbf{V}^+ \mathbf{H} \mathbf{V} - \ln \det(\boldsymbol{\phi}_1 \boldsymbol{\phi}_0^{-1}) = U - B, \tag{11-31}$$

in which the Hermitian matrix

$$\mathbf{H} = \boldsymbol{\phi}_0^{-1} - \boldsymbol{\phi}_1^{-1} \tag{11-32}$$

is the solution of the matrix equation

$$\boldsymbol{\phi}_0 \mathbf{H} \boldsymbol{\phi}_1 = \boldsymbol{\phi}_1 \mathbf{H} \boldsymbol{\phi}_0 = \boldsymbol{\phi}_1 - \boldsymbol{\phi}_0 = \boldsymbol{\phi}_s, \tag{11-33}$$

and

$$B = \ln \det(\mathbf{I} + \boldsymbol{\phi}_s \boldsymbol{\phi}_0^{-1}). \tag{11-34}$$

Under the Neyman-Pearson criterion the decision about the presence or absence of the signal can be based on the quadratic functional

$$U = \tfrac{1}{2} \mathbf{V}^+ \mathbf{H} \mathbf{V} = \tfrac{1}{2} \int_0^T \int_0^T V^*(t) h(t, s) V(s)\, dt\, ds \tag{11-35}$$

of the input. The kernel of this functional is, by (11-33), the solution of the double integral equation

$$\int_0^T \int_0^T \phi_0(t, u) h(u, v) \phi_1(v, s)\, du\, dv = \phi_s(t, s), \qquad 0 < (t, s) < T. \tag{11-36}$$

It can be broken into two integral equations

$$\phi_s(t, s) = \int_0^T g(t, u) \phi_1(u, s)\, du, \qquad 0 < (t, s) < T, \tag{11-37}$$

$$g(t, s) = \int_0^T \phi_0(t, r) h(r, s)\, dr, \qquad 0 < (t, s) < T, \tag{11-38}$$

which must be solved successively. If the statistic $U$ exceeds a decision level $U_0$, the receiver decides that a realization of the stochastic process $\operatorname{Re} S(t) \exp i\Omega t$ is present. The decision level is selected to achieve a preassigned false-alarm probability

$$Q_0 = \Pr(U > U_0 \mid H_0).$$
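For a finite number of samples the relations (11-31) to (11-34) are ordinary matrix identities and can be verified directly. The sketch below assumes small, arbitrarily chosen covariance matrices and an arbitrary sample vector; none of the numerical values comes from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
t = np.arange(n)
lag = np.abs(t[:, None] - t[None, :])
phi0 = np.eye(n) + 0.3 * np.exp(-0.5 * lag)     # noise covariance (assumed)
phis = 2.0 * np.exp(-1.0 * lag)                  # signal covariance (assumed)
phi1 = phis + phi0

H = np.linalg.inv(phi0) - np.linalg.inv(phi1)                      # eq. (11-32)
B = np.log(np.linalg.det(np.eye(n) + phis @ np.linalg.inv(phi0)))  # bias term B

def log_density(V, phi):
    """Logarithm of the circular complex Gaussian density (11-28)."""
    quad = np.real(V.conj() @ np.linalg.solve(phi, V))
    return -len(V) * np.log(2 * np.pi) - np.log(np.linalg.det(phi)) - 0.5 * quad

V = rng.normal(size=n) + 1j * rng.normal(size=n)   # arbitrary sample vector
U = 0.5 * np.real(V.conj() @ H @ V)                # matrix form of (11-35)
log_lr = log_density(V, phi1) - log_density(V, phi0)
```

The logarithm of the ratio of the two densities should equal $U - B$, and $\mathbf{H}$ should satisfy (11-33).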
11.1.4 Realizations of the Optimum Detector

The test statistic $U$ in (11-35) can be generated by means of a properly matched time-variable or nonstationary linear filter [Pri56]. Because $h(t, s) = h^*(s, t)$, the statistic can be written

$$U = \operatorname{Re} \int_0^T V^*(t)\, dt \int_0^t h(t, s) V(s)\, ds = \operatorname{Re} \int_0^T V^*(t) W(t)\, dt, \tag{11-39}$$

$$W(t) = \int_0^t h(t, s) V(s)\, ds.$$

For each value of $t$, the function $W(t)$ is a weighted average of the input that has arrived before time $t$. It can be generated by passing the input $\operatorname{Re}[V(t) \exp i\Omega t]$ through a time-variable linear filter whose narrowband impulse response is

$$K_t(\tau) = \begin{cases} h(t, t - \tau), & 0 < \tau < t, \\ 0, & \tau > t. \end{cases}$$

The output of the filter at time $t$ is $\operatorname{Re} W(t) \exp i\Omega t$, where

$$W(t) = \int_0^t K_t(\tau) V(t - \tau)\, d\tau = \int_0^t h(t, t - \tau) V(t - \tau)\, d\tau,$$

which is the same as (11-39). Because this filter is causal, it can in principle be realized physically; it operates only on the input $v(t)$ previously received.
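The passage from the double integral (11-35) to the causal form (11-39) has a simple discrete analogue. The sketch below assumes unit sample spacing and a randomly generated Hermitian kernel; in the discrete sum the diagonal terms $s = t$ must be given half weight, whereas in the integral the diagonal has zero measure.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
h = A + A.conj().T                                  # any Hermitian kernel h(t_i, t_j)
V = rng.normal(size=n) + 1j * rng.normal(size=n)    # samples of the complex envelope

U_quadratic = 0.5 * np.real(V.conj() @ h @ V)       # double sum, as in (11-35)

# Causal weights: keep only s < t, plus half of the diagonal s = t
h_causal = np.tril(h, k=-1) + 0.5 * np.diag(np.diag(h))
W = h_causal @ V                                    # W(t_i) uses only V(t_j), j <= i
U_causal = np.real(V.conj() @ W)                    # Re sum of V*(t) W(t), as in (11-39)
```

Here `W[i]` depends only on samples up to index `i`, which is the discrete counterpart of the causal filter described above.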

The output of the time-variable filter is multiplied at each instant by the input $\operatorname{Re} V(t) \exp i\Omega t$, and the high-frequency components of the product are removed by filtering. By writing out these factors in terms of $\cos \Omega t$ and $\sin \Omega t$ and multiplying them, the reader can show that their product is

$$\operatorname{Re} V(t) e^{i\Omega t} \cdot \operatorname{Re} W(t) e^{i\Omega t} = \tfrac{1}{2} \operatorname{Re} V^*(t) W(t) + \text{terms of frequency } 2\Omega.$$

The product is integrated by a low-pass filter with impulse response

$$k(\tau) \equiv 1, \quad 0 < \tau < T; \qquad k(\tau) \equiv 0, \quad \tau > T,$$

whose output at time $T$ is

$$\tfrac{1}{2} \operatorname{Re} \int_0^T V^*(t) W(t)\, dt = \tfrac{1}{2} U.$$

Figure 11-1. Optimum detector of stochastic signals.

The operation of this receiver is illustrated in Fig. 11-1. The required multiplication can be most easily accomplished by passing the sum $\operatorname{Re}[V(t) + W(t)] \exp i\Omega t$ through a quadratic rectifier, the output of which is

$$|V(t) + W(t)|^2 = |V(t)|^2 + 2 \operatorname{Re} V^*(t) W(t) + |W(t)|^2.$$

The separately rectified outputs $|W(t)|^2$ and $|V(t)|^2$ are subtracted to leave the desired product.

A second way of generating the test statistic $U$ employs a time-variable linear filter followed by a quadratic rectifier and an integrator [Mid60b]. The output of the filter has the complex envelope

$$X(t) = \int_0^T m(t, s) V(s)\, ds, \tag{11-40}$$

and this output is rectified and integrated to produce

$$U = \tfrac{1}{2} \int_0^T |X(t)|^2\, dt. \tag{11-41}$$

In order for this quantity to equal the test statistic as given by (11-35), the weighting function $m(t, s)$ must satisfy the nonlinear integral equation

$$h(t, s) = \int_0^T m^*(u, t) m(u, s)\, du, \qquad 0 < (t, s) < T, \tag{11-42}$$

where $h(t, s)$ is the solution of the integral equation (11-36).

This procedure corresponds to decomposing the matrix $\mathbf{H}$ of (11-32) into a product, $\mathbf{H} = \mathbf{M}^+ \mathbf{M}$, and to $X(t)$ corresponds the vector $\mathbf{X} = \mathbf{M}\mathbf{V}$, whereupon

$$U = \tfrac{1}{2} \mathbf{X}^+ \mathbf{X} = \tfrac{1}{2} \int_0^T |X(t)|^2\, dt$$

as in (11-41). In order for the filter to be realizable, the decomposition must be carried out in such a way that $m(t, s) \equiv 0$, $s > t$, and then

$$X(t) = \int_0^t m(t, s) V(s)\, ds$$

can be generated simultaneously with the reception of the input $v(t) = \operatorname{Re} V(t) \exp i\Omega t$. To determine a solution of the nonlinear integral equation (11-42) having this causal property is in general most difficult.
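For a finite number of samples the causal factorization $\mathbf{H} = \mathbf{M}^+\mathbf{M}$ is easy to obtain numerically, even though the integral equation (11-42) is hard to solve analytically. The sketch below is one possible construction, not the method of the text: it assumes white noise, an exponential signal covariance on a uniform grid, and uses a reversed Cholesky factorization to obtain a lower-triangular $\mathbf{M}$.

```python
import numpy as np

n = 40
t = np.arange(n) * 0.1
phi0 = np.eye(n)                                              # white noise, N = 1 (assumed)
phis = 2.0 * np.exp(-1.0 * np.abs(t[:, None] - t[None, :]))   # signal covariance (assumed)
H = np.linalg.inv(phi0) - np.linalg.inv(phi0 + phis)          # eq. (11-32), positive definite here

# Reversed Cholesky: flip the ordering, factor, and flip back, so that
# H = M^+ M with M lower triangular, i.e. m(t, s) = 0 for s > t.
J = np.eye(n)[::-1]                   # exchange (order-reversing) matrix
L = np.linalg.cholesky(J @ H @ J)     # J H J = L L^+
M = (J @ L @ J).conj().T              # lower-triangular factor

rng = np.random.default_rng(2)
V = rng.normal(size=n) + 1j * rng.normal(size=n)
X = M @ V                                   # causal filter output, X = M V
U_filter = 0.5 * np.sum(np.abs(X) ** 2)     # rectify and sum, as in (11-41)
U_direct = 0.5 * np.real(V.conj() @ H @ V)  # quadratic form (11-35)
```

Because `M` is lower triangular, each `X[i]` can be formed as soon as sample `i` has arrived.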

11.1.5 Stationarity over a Long Observation Interval

When both signal and noise are realizations of stationary random processes, their autocovariance functions depend on $t$ and $s$ only through $t - s$, and we write them as

$$\phi_0(t, s) \to \phi_0(t - s), \qquad \phi_s(t, s) \to \phi_s(t - s),$$

and $\phi_1(t, s)$ is replaced by

$$\phi_1(t - s) = \phi_s(t - s) + \phi_0(t - s).$$

If the observation interval $(0, T)$ is much longer than the correlation times of signal and noise, we can approximate (11-36) by

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi_1(t - u) h(u, v) \phi_0(v - s)\, du\, dv = \phi_s(t - s), \tag{11-43}$$

and $h(t, s)$ must also be a function only of $t - s$. We can then solve (11-43) by applying the convolution theorem for Fourier transforms. Designating these transforms by capital letters, we write it as

$$\Phi_1(\omega) H(\omega) \Phi_0(\omega) = \Phi_s(\omega),$$

which yields

$$H(\omega) = \frac{\Phi_s(\omega)}{\Phi_1(\omega) \Phi_0(\omega)}. \tag{11-44}$$

[The transforms $\Phi_0(\omega)$, $\Phi_s(\omega)$, and $\Phi_1(\omega)$ are, by (3-19), twice the corresponding narrowband spectral densities. We adopt this new convention in order to eliminate bothersome factors of 2 from our equations.]

Under these circumstances, the function $m(t, s)$ in (11-42), whose limits are now $(-\infty, \infty)$, also depends on $t$ and $s$ through $t - s$, and in terms of its Fourier transform $M(\omega)$,

$$m(t - s) = \int_{-\infty}^{\infty} M(\omega) e^{i\omega(t - s)}\, \frac{d\omega}{2\pi}.$$

Then we can write (11-42) as $|M(\omega)|^2 = H(\omega)$, and

$$M(\omega) = \left[\frac{\Phi_s(\omega)}{\Phi_1(\omega) \Phi_0(\omega)}\right]^{1/2} e^{i\gamma(\omega)}, \tag{11-45}$$

where $\gamma(\omega)$ is a phase that can be chosen in such a way as to make the filter whose impulse response is $m(\tau)$ physically realizable, whereupon

$$X(t) = \int_0^t m(t - s) V(s)\, ds.$$

The output of this filter is quadratically rectified and integrated over the observation interval $(0, T)$ as in (11-41) to produce an approximation to the optimum statistic $U$ that is the more accurate the longer the interval $(0, T)$.

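Suppose, for instance, that the complex autocovariance function of the signal is

$$\phi_s(\tau) = P_s e^{-\mu|\tau|}, \tag{11-46}$$

and that the noise is white, $\phi_0(\tau) = N\delta(\tau)$. Then the signal has a Lorentz spectral density

$$\Phi_s(\omega) = \frac{2\mu P_s}{\omega^2 + \mu^2}, \tag{11-47}$$

and under hypothesis $H_1$ the spectral density of the input is

$$\Phi_1(\omega) = N + \frac{2\mu P_s}{\omega^2 + \mu^2} = N\, \frac{\omega^2 + \beta^2}{\omega^2 + \mu^2}$$

with

$$\beta^2 = \mu^2 + \frac{2\mu P_s}{N},$$

whereupon, by (11-44),

$$H(\omega) = \frac{2\mu P_s}{N^2(\omega^2 + \beta^2)}.$$

The transfer function of the causal filter is then

$$M(\omega) = \frac{(2\mu P_s)^{1/2}}{N(\beta + i\omega)},$$

and its impulse response is

$$m(\tau) = \frac{(2\mu P_s)^{1/2}}{N}\, e^{-\beta\tau}\, U(\tau),$$

where $U(\tau)$ is the unit step function. When $\mu T \gg 1$, $m(t - s)$ will be a close approximation to the optimum time-variable filter $m(t, s)$.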
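The algebra of this example can be checked in a few lines. The sketch below evaluates $H(\omega)$ from the three spectral densities as in (11-44) and compares it with $|M(\omega)|^2$ for the causal transfer function; the numerical values of $P_s$, $\mu$, and $N$ are illustrative assumptions.

```python
import math

def beta(Ps, mu, N):
    """beta from beta^2 = mu^2 + 2 mu Ps / N."""
    return math.sqrt(mu**2 + 2.0 * mu * Ps / N)

def H_spectrum(omega, Ps, mu, N):
    """H(omega) = Phi_s / (Phi_1 Phi_0), eq. (11-44), built from the spectra."""
    phi_s = 2.0 * mu * Ps / (omega**2 + mu**2)   # Lorentz signal spectrum
    phi_0 = N                                     # white noise
    phi_1 = phi_s + phi_0
    return phi_s / (phi_1 * phi_0)

def M_transfer(omega, Ps, mu, N):
    """Causal transfer function sqrt(2 mu Ps) / (N (beta + i omega))."""
    return math.sqrt(2.0 * mu * Ps) / (N * complex(beta(Ps, mu, N), omega))

def m_impulse(tau, Ps, mu, N):
    """Impulse response (sqrt(2 mu Ps) / N) exp(-beta tau) for tau >= 0, else 0."""
    if tau < 0.0:
        return 0.0
    return math.sqrt(2.0 * mu * Ps) / N * math.exp(-beta(Ps, mu, N) * tau)

Ps, mu, N = 4.0, 1.5, 0.5      # illustrative values, not from the text
mismatch = max(abs(abs(M_transfer(w, Ps, mu, N)) ** 2 - H_spectrum(w, Ps, mu, N))
               for w in (0.0, 0.7, 3.0, 20.0))
```

The value of `mismatch` should be at the level of rounding error, confirming $|M(\omega)|^2 = H(\omega)$ for this spectrum.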

11.1.6 The Threshold Detector

When the signal is much weaker than the noise, we can replace $\phi_1(t, u)$ in (11-36) by $\phi_0(t, u)$, obtaining the integral equation

$$\int_0^T \int_0^T \phi_0(t, u) h_\theta(u, v) \phi_0(v, s)\, du\, dv = \phi_s(t, s), \qquad 0 < (t, s) < T, \tag{11-48}$$

for the kernel $h_\theta(t, s)$ of the threshold statistic

$$U_\theta = \tfrac{1}{2} \int_0^T \int_0^T V^*(t) h_\theta(t, s) V(s)\, dt\, ds \tag{11-49}$$

for detection of the stochastic signal $\operatorname{Re} S(t) \exp i\Omega t$. This statistic can be implemented by the means outlined in Sec. 11.1.4. When the noise is white, as in (11-5), the kernel of the threshold statistic is simply

$$h_\theta(t, s) = N^{-2} \phi_s(t, s). \tag{11-50}$$

When as in Sec. 11.1.5 the signal and noise are stationary and the observation interval $(0, T)$ is much longer than their correlation times, the threshold statistic can be realized by a time-invariant linear filter followed by a quadratic rectifier. The transfer function of the filter is

$$M_\theta(\omega) = \frac{[\Phi_s(\omega)]^{1/2}}{\Phi_0(\omega)}\, e^{i\gamma(\omega)}, \tag{11-51}$$

as follows from (11-45) by approximating $\Phi_1(\omega)$ by $\Phi_0(\omega)$. Again the phase $\gamma(\omega)$ is chosen so that the filter is realizable. This filter is known as the Eckart filter [Eck51], [Mac68].
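How close the Eckart filter is to the optimum filter depends on the signal-to-noise ratio. From (11-45) and (11-51) the ratio of squared gains is $\Phi_0(\omega)/\Phi_1(\omega)$. The sketch below evaluates it for the Lorentz signal spectrum in white noise; the parameter values are illustrative assumptions.

```python
def gain_ratio(omega, Ps, mu, N):
    """|M(omega)|^2 / |M_theta(omega)|^2 for a Lorentz signal spectrum in white noise."""
    phi_s = 2.0 * mu * Ps / (omega**2 + mu**2)
    optimum = phi_s / ((N + phi_s) * N)     # |M|^2 from (11-45)
    eckart = phi_s / N**2                   # |M_theta|^2 from (11-51)
    return optimum / eckart                 # equals Phi_0 / Phi_1

weak = gain_ratio(0.0, Ps=1e-4, mu=1.0, N=1.0)     # weak signal: ratio near 1
strong = gain_ratio(0.0, Ps=10.0, mu=1.0, N=1.0)   # strong signal: ratio well below 1
```

For weak signals the two filters nearly coincide; for strong signals the optimum filter has markedly lower gain near the center of the signal band.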

The signal can now be thought of as a dense succession of pulses whose spectrum is proportional to

$$[\Phi_s(\omega)]^{1/2}$$

and that occur at random times and with independently random amplitudes, as in (10-41). As in (2-86), the Eckart filter is an approximation to the optimum filter for detecting each such pulse in noise of spectral density $\Phi_0(\omega)$, the approximation assuming that the observation interval is much longer than the duration of the pulses. Because they are narrowband signals with random phases and arrive throughout the interval $(0, T)$, one integrates the quadratically rectified output of the filter as in (11-41) to produce the threshold detection statistic $U_\theta$.

11.1.7 The Radiometer 

If the signal is an echo from a fluctuating point target, its autocovariance function $\phi_s(t, s)$ can be determined from (11-4) by inserting $\psi(t, \tau) = \psi(t)\delta(\tau - \tau_0)$, and it is

$$\phi_s(t, s) = F(t - \tau_0)\, \psi(t - s)\, F^*(s - \tau_0),$$

where $\tau_0$ is the delay to and from the target. Suppose now that the threshold statistic in (11-49) is to be used in a receiver to detect this target in white noise. It is given by the equation

$$U_\theta = \frac{1}{2N^2} \int_0^T \int_0^T V^*(t) F(t - \tau_0) \psi(t - s) F^*(s - \tau_0) V(s)\, dt\, ds = \frac{1}{2N^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} Y^*(t) \psi(t - s) Y(s)\, dt\, ds, \tag{11-52}$$

where $Y(t) = F^*(t - \tau_0) V(t)$ can be formed by multiplying the input by a locally generated replica of the transmitted signal with the proper time delay, assumed known. In the second integral the integrations need to be carried out only over intervals during which the signal might arrive and $Y(t) \ne 0$.

The realizations we described in Sec. 11.1.4 might now be applied to $U_\theta$ of (11-52), the function $\psi(t - s)$ taking the place of $h(t, s)$ and $Y(t)$ the place of $V(t)$. In particular we might employ the second realization, for which it is necessary to solve an equation similar to (11-42). If the range of integration is extended to span the entire time axis, $-\infty < t < \infty$, the solution becomes a function only of the difference of the two arguments, and the equation itself takes the form

$$\int_{-\infty}^{\infty} g^*(x - t)\, g(x - s)\, dx = \psi(t - s).$$

Introducing the Fourier transforms

$$G(\omega) = \int_{-\infty}^{\infty} g(t) e^{-i\omega t}\, dt, \qquad \Psi(\omega) = \int_{-\infty}^{\infty} \psi(t) e^{-i\omega t}\, dt,$$

we find by the convolution theorem that $|G(\omega)|^2 = \Psi(\omega)$, and $G(\omega)$ can be taken as

$$G(\omega) = [\Psi(\omega)]^{1/2} e^{i\chi(\omega)},$$

the phase $\chi(\omega)$ being chosen so that a filter of impulse response $g(\tau)$ is physically realizable, with $g(t) = 0$, $t < 0$. The test statistic then becomes

$$U_\theta = \frac{1}{2N^2} \int_{-\infty}^{\infty} \left| \int_{-\infty}^{t} g(t - s) Y(s)\, ds \right|^2 dt.$$
A device generating the threshold statistic in this approximate way was 
proposed by Price and Green [Pri60] for detecting signals in radar astronomy. They 
called it a radiometer. In practice, they pointed out, it will not be necessary to 
integrate the outputs of the filter or the rectifier over a very long interval in order 
to achieve a good approximation to the threshold detector. 

If the target is deep fluctuating and the autocovariance of the echo signals 
is given as in (11-4), it is merely necessary to construct a parallel bank of these 
radiometers, each matched to the transmitted signal with one of a dense set of delays
$\tau_0$. The impulse responses $g(\tau)$ may differ from one filter to another. Price and Green
[Pri60] termed this more elaborate device a Rake radiometer, for it is reminiscent of a 
similar device used in the detection of signals in a multipath communication system, 
the Rake receiver [Pri58]. 

11.1.8 An Example 

To illustrate the ideas of this section and the next, it is instructive to have before us the simple example of a stationary stochastic signal with the exponential autocovariance function in (11-46),

$$\phi_s(t, s) = \phi_s(t - s) = P_s e^{-\mu|t - s|}.$$

The signal is received in the presence of white Gaussian noise of unilateral spectral density $N$. The modulation $M(t)$, (11-1), is taken as constant, and the signals, which are also Gaussian processes, can be generated by passing white noise through a narrowband simply resonant circuit of bandwidth $\mu$. They are observed during the interval $(0, T)$. This case has been analyzed by Price [Pri56] and others.

For future use it will be convenient to have the solution of the integral equation

$$N h(r, t; u) + u \int_0^T \phi_s(r, s) h(s, t; u)\, ds = N^{-1} \phi_s(r, t), \qquad 0 < (r, t) < T. \tag{11-53}$$

The kernel of the test statistic is $h(t, s) = h(t, s; 1)$, by (11-36) with

$$\phi_0(t, s) = N\delta(t - s), \qquad \phi_1(t, s) = N\delta(t - s) + \phi_s(t - s).$$

The integral equation (11-53) takes the form of (2-91) when we identify the kernel as

$$\phi(r - s) = N\delta(r - s) + u\phi_s(r - s),$$

and the method outlined in Sec. 2.3 can be applied. The Fourier transform

$$\Phi(\omega) = \int_{-\infty}^{\infty} \phi(\tau) e^{-i\omega\tau}\, d\tau$$

of the kernel is a rational function of frequency:

$$\Phi(\omega) = N + u\Phi_s(\omega) = N + \frac{2\mu u P_s}{\omega^2 + \mu^2} = N\, \frac{\omega^2 + \beta^2}{\omega^2 + \mu^2}, \qquad \beta^2 = \mu^2 + \frac{2\mu u P_s}{N},$$

where $\Phi_s(\omega)$ is the spectral density of the signal.

The form of the solution of the integral equation is given in (2-100). What corresponds to $q_0(t)$ in (2-100) is the solution $h_0(t - s)$ of (11-53) when the limits of integration are $-\infty$ and $+\infty$ instead of $0$ and $T$, and this solution is the inverse Fourier transform of

$$H_0(\omega) = \int_{-\infty}^{\infty} h_0(\tau) e^{-i\omega\tau}\, d\tau = \frac{\Phi_s(\omega)}{N\Phi(\omega)} = \frac{2\mu P_s}{N^2(\beta^2 + \omega^2)},$$

whereupon

$$h_0(\tau) = \frac{\mu P_s}{\beta N^2}\, e^{-\beta|\tau|}.$$

The terms of (2-100) with delta functions are now absent because the degrees of the numerator and the denominator of $\Phi(\omega)$ are equal. Hence the solution has the form

$$h(r, t; u) = \frac{\mu P_s}{\beta N^2}\, e^{-\beta|r - t|} + A e^{\beta r} + B e^{-\beta r}, \tag{11-54}$$

where $A$ and $B$ are functions of $t$.

To determine the unknown functions $A(t)$ and $B(t)$, we substitute (11-54) into (11-53). When we carry out the integration and use the definition of $\beta$, we find that all the terms in $\exp \beta r$ and $\exp(-\beta r)$ cancel, as does the term on the right side of (11-53). We are left only with terms proportional to either $\exp \mu r$ or $\exp(-\mu r)$. Setting the coefficients of each of these functions separately equal to zero, we obtain two simultaneous linear equations for $A$ and $B$, which are solved in the usual way and yield

$$A = \frac{\mu P_s (\beta - \mu)\left[(\beta + \mu) e^{\beta t} + (\beta - \mu) e^{-\beta t}\right] e^{-\beta T}}{N^2 \beta \left[(\beta + \mu)^2 e^{\beta T} - (\beta - \mu)^2 e^{-\beta T}\right]},$$

$$B = \frac{\mu P_s (\beta - \mu)\left[(\beta + \mu) e^{\beta(T - t)} + (\beta - \mu) e^{-\beta(T - t)}\right]}{N^2 \beta \left[(\beta + \mu)^2 e^{\beta T} - (\beta - \mu)^2 e^{-\beta T}\right]}. \tag{11-55}$$

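Substituting these into (11-54) and treating the regions $r < t$ and $r > t$ separately, we combine terms to derive the solution

$$h(r, t; u) = \frac{\mu P_s \left[(\beta + \mu) e^{\beta r} + (\beta - \mu) e^{-\beta r}\right]\left[(\beta + \mu) e^{\beta(T - t)} + (\beta - \mu) e^{-\beta(T - t)}\right]}{N^2 \beta \left[(\beta + \mu)^2 e^{\beta T} - (\beta - \mu)^2 e^{-\beta T}\right]}, \qquad 0 < r < t < T. \tag{11-56}$$

The solution for $0 < t < r < T$ is found by interchanging $r$ and $t$ in this expression. The kernel of the detection statistic in (11-35) is given by (11-56) when one determines $\beta$ from

$$\beta^2 = \mu^2 + \frac{2\mu P_s}{N}.$$

Substituting into (11-39), we get the test statistic

$$U = \frac{\mu P_s}{N^2 \beta \left[(\beta + \mu)^2 e^{\beta T} - (\beta - \mu)^2 e^{-\beta T}\right]} \operatorname{Re} \int_0^T V^*(t)\left[(\beta + \mu) e^{\beta(T - t)} + (\beta - \mu) e^{-\beta(T - t)}\right] dt \int_0^t \left[(\beta + \mu) e^{\beta s} + (\beta - \mu) e^{-\beta s}\right] V(s)\, ds.$$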
In the present example the threshold detector for stochastic signals having the autocovariance function in (11-46) is based on the approximation in (11-50) and furnishes the statistic

$$U_\theta = \frac{P_s}{N^2} \operatorname{Re} \int_0^T V^*(t)\, dt \int_0^t e^{-\mu(t - s)} V(s)\, ds = \frac{P_s}{N^2} \operatorname{Re} \int_0^T V^*(t)\, dt \int_0^t e^{-\mu\tau} V(t - \tau)\, d\tau.$$

If the input is turned on at time $t = 0$, the term

$$\int_0^t e^{-\mu\tau} V(t - \tau)\, d\tau$$

is proportional to the envelope of the output of a narrowband simply resonant circuit of bandwidth $\mu$, tuned to the carrier frequency $\Omega$. This output is multiplied by the input $v(t) = \operatorname{Re} V(t) \exp i\Omega t$ in the manner described in Sec. 11.1.4, and the product is integrated over a period of duration $T$. Such a threshold detection system can be made independent of the true signal power $\phi_s(0, 0)$, which may not be known in advance; the optimum system, on the other hand, depends on the strengths of both signals and noise.

Unfortunately, there seems to be no such simple approximation in the general case when the signal-to-noise ratio is large. In the present example we see from (11-56) that for large signal-to-noise ratio and long integration time $T$ ($\beta T \gg 1$), the dominant term in the kernel $h(r, t)$ is proportional to $\exp[-\beta|r - t|]$. Hence the optimum detection system is nearly the same as the threshold receiver, except that the bandwidth of the input filter is $(2\mu P_s/N)^{1/2}$ instead of $\mu$. In the next section we shall attack the problem of calculating the false-alarm and detection probabilities for such receivers.



11.1.9 Estimator-correlator Interpretation 



In the detection of signals having unknown parameters, the method of maximum 
likelihood involves our pretending that the signal is present, finding the maximum- 
likelihood estimates of those parameters, and then constructing — at least 
conceptually — the optimum or threshold receiver for detecting a signal having pa- 
rameter values equal to those estimates. When the output of this receiver exceeds a 
certain decision level, the system decides that a signal is present. When a stochastic 
signal $s(t) = \operatorname{Re} S(t) \exp i\Omega t$ is to be detected, the parameters can be taken as all the sample values

$$S_k = \int_0^T f_k^*(t) S(t)\, dt \tag{11-57}$$

of the signal. When the signal is a Gaussian random process, the real and imaginary parts of any finite number $n$ of these have a joint probability density function of the circular Gaussian form

$$z(\mathbf{S}) = C_1 \exp\left(-\tfrac{1}{2} \mathbf{S}^+ \boldsymbol{\phi}_s^{-1} \mathbf{S}\right),$$

where $\boldsymbol{\phi}_s$ is the corresponding $n \times n$ block of their complex covariance matrix, and $C_1$ is a normalization constant. [We omit the superscripts $(n)$ used in (11-28).] The conditional probability density function of the real and imaginary parts of $n$ samples $(V_1, V_2, \ldots, V_n)$ of the input, given the presence of the signal, is

$$p_1(\mathbf{V} \mid \mathbf{S}) = C_2 \exp\left[-\tfrac{1}{2} (\mathbf{V}^+ - \mathbf{S}^+) \boldsymbol{\phi}_0^{-1} (\mathbf{V} - \mathbf{S})\right],$$

where $\boldsymbol{\phi}_0$ is the corresponding $n \times n$ block of the covariance matrix of the samples of the noise, and $C_2$ a normalization constant. By the same kind of analysis as in Sec. 6.1.4, we find (see (6-20)) that the vector $\hat{\mathbf{S}}$ of maximum-likelihood estimators of the coefficients $S_k$ is given by

$$\hat{\mathbf{S}} = \boldsymbol{\phi}_s \boldsymbol{\phi}_1^{-1} \mathbf{V}. \tag{11-58}$$

At this point we can let the number $n$ of samples go to infinity so that all the information in the input $\operatorname{Re} V(t) \exp i\Omega t$ is utilized in forming the maximum-likelihood estimator.

The maximum-likelihood receiver pretends that it is detecting the signal $\operatorname{Re} \hat{S}(t) \exp i\Omega t$, corresponding to the vector $\hat{\mathbf{S}}$, in noise with complex autocovariance function $\phi_0(t, s)$. It must therefore contain a filter matched to the signal $\operatorname{Re} Q(t) \exp i\Omega t$, where $Q(t)$ is the solution of the integral equation

$$\hat{S}(t) = \int_0^T \phi_0(t, u) Q(u)\, du, \qquad 0 < t < T, \tag{11-59}$$

as in (3-52). By (11-12) and (11-13) the vector representation of this equation is $\hat{\mathbf{S}} = \boldsymbol{\phi}_0 \mathbf{Q}$, and

$$\mathbf{Q} = \boldsymbol{\phi}_0^{-1} \hat{\mathbf{S}} = \boldsymbol{\phi}_0^{-1} \boldsymbol{\phi}_s \boldsymbol{\phi}_1^{-1} \mathbf{V} = \mathbf{H}\mathbf{V},$$

for by (11-33) and because of the Hermitian form of the matrix $\mathbf{H}$,

$$\mathbf{H} = \boldsymbol{\phi}_0^{-1} \boldsymbol{\phi}_s \boldsymbol{\phi}_1^{-1} = \boldsymbol{\phi}_1^{-1} \boldsymbol{\phi}_s \boldsymbol{\phi}_0^{-1}.$$

The output of the matched filter, sampled at the end of the observation interval $(0, T)$, is

$$\int_0^T Q^*(t) V(t)\, dt = \mathbf{Q}^+ \mathbf{V} = \mathbf{V}^+ \mathbf{H} \mathbf{V} = 2U \tag{11-60}$$

by (11-9), and by (11-35) this is twice our detection statistic $U$. Thus the optimum receiver in effect estimates the complex envelope $S(t)$ of the signal as though the signal were known to be present, and as in (11-60) it "correlates" its input with the solution $Q(t)$ of (11-59). This is known as the estimator-correlator interpretation of the optimum detector of a Gaussian stochastic signal in Gaussian noise [Kai60]. The maximum-likelihood estimator of the signal is, by (11-58),

$$\hat{S}(t) = \int_0^T g(t, u) V(u)\, du, \qquad 0 < t < T,$$

where $g(t, u)$ is the solution of the integral equation (11-37). Because the estimator utilizes the input $\operatorname{Re} V(t) \exp i\Omega t$ over the entire interval $(0, T)$, it is not causal, and $\hat{S}(t)$ and $Q(t)$ cannot be generated until after the observation interval is past. We shall see later that when the noise is white, it is possible to base the design of the optimum receiver on a causal estimator of the signal, assumed present in the input $v(t)$.
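The estimator-correlator identity can be confirmed for a finite number of samples. The sketch below assumes arbitrary positive-definite covariance matrices and an arbitrary input vector; the values are illustrative and not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8
t = np.arange(n) * 0.25
lag = np.abs(t[:, None] - t[None, :])
phi0 = 0.5 * np.eye(n) + 0.2 * np.exp(-2.0 * lag)   # colored noise (assumed)
phis = 1.5 * np.exp(-1.0 * lag)                      # signal covariance (assumed)
phi1 = phi0 + phis
V = rng.normal(size=n) + 1j * rng.normal(size=n)     # arbitrary input samples

S_hat = phis @ np.linalg.solve(phi1, V)          # eq. (11-58): estimate of the signal samples
Q = np.linalg.solve(phi0, S_hat)                 # vector form of (11-59): S_hat = phi0 Q
H = np.linalg.inv(phi0) - np.linalg.inv(phi1)    # eq. (11-32)
U = 0.5 * np.real(V.conj() @ H @ V)              # detection statistic (11-35)
correlator_output = np.real(Q.conj() @ V)        # Q^+ V, as in (11-60)
```

The correlator output should equal $2U$, and `S_hat` should satisfy the stationarity condition $\boldsymbol{\phi}_s^{-1}\hat{\mathbf{S}} = \boldsymbol{\phi}_0^{-1}(\mathbf{V} - \hat{\mathbf{S}})$ obtained by maximizing the product of the two densities above.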

11.1.10 The Question of Singularity 

When we studied the detection of a deterministic signal in colored Gaussian noise, 
we found that under certain circumstances the theory predicted that the signal could 
be detected with zero probability of error. This situation, known as the "singular 
case of perfect detection," would arise, for instance, if the spectral density of the 
noise vanished in a region of frequencies where the spectrum of the signal remained 
finite. The same possibility of perfect detection must be considered in dealing with 
stochastic signals. 

Usually one's model of the signal and the noise is at least partly conjectural, 
and its accuracy cannot be completely verified. The model is often one that has 
been simplified to make it mathematically tractable. If upon an analysis based on 
it, the singular case turns up and the signals appear to be perfectly detectable, the 
model must be at fault, for nature never permits complete freedom from the chance
of error. Our treatment of the problem of singularity will necessarily be crude. For 
rigorous proofs the reader must look to the references we shall cite. 

When a finite number $n$ of samples are utilized, as (11-30) shows,

$$U_n = \tfrac{1}{2} \mathbf{V}^{(n)+} \left(\boldsymbol{\phi}_0^{(n)-1} - \boldsymbol{\phi}_1^{(n)-1}\right) \mathbf{V}^{(n)} \tag{11-61}$$

is a sufficient statistic. A stochastic signal will be perfectly detectable only if the probability density functions of that statistic under hypotheses $H_0$ and $H_1$ recede so far from each other as $n$ goes to infinity that they no longer overlap. It will then be possible to set the decision level $U_0$ at such a point between them that $Q_0 = 0$ and $Q_d = 1$. Because the statistic $U_n$ is a quadratic form in Gaussian random variables, however, its probability density function will be finite at all positive values of $U_n$ under both hypotheses. The only way by which the probability density functions $p_0(U_n)$ and $p_1(U_n)$ can cease to overlap as $n$ goes to infinity, therefore, is for the difference of the expected values

$$\Delta U_n = E(U_n \mid H_1) - E(U_n \mid H_0)$$

to become ever larger with respect to the standard deviations of $U_n$ under the two hypotheses. Conversely, the probability density functions will continue to overlap if the ratio

$$\frac{(\Delta U_n)^2}{\operatorname{Var}_0 U_n}$$

remains finite as $n$ grows beyond all bounds. ($\operatorname{Var}_j$ indicates the variance under hypothesis $H_j$.) It is the limiting value of this ratio that settles the question of singularity.

By (11-33) and (11-61)

$$\Delta U_n = \operatorname{Tr} \mathbf{H}\boldsymbol{\phi}_s = \operatorname{Tr}\left(\boldsymbol{\phi}_s \boldsymbol{\phi}_0^{-1} \boldsymbol{\phi}_s \boldsymbol{\phi}_1^{-1}\right) = \operatorname{Tr}\left(\boldsymbol{\phi}_s^{1/2} \boldsymbol{\phi}_0^{-1} \boldsymbol{\phi}_s^{1/2} \cdot \boldsymbol{\phi}_s^{1/2} \boldsymbol{\phi}_1^{-1} \boldsymbol{\phi}_s^{1/2}\right),$$

where we have used the rule $\operatorname{Tr} \mathbf{A}\mathbf{B} = \operatorname{Tr} \mathbf{B}\mathbf{A}$ for any two matrices $\mathbf{A}$ and $\mathbf{B}$. Here $\boldsymbol{\phi}_s^{1/2}$ is the square root of the nonnegative-definite matrix $\boldsymbol{\phi}_s$. It can be found if necessary by diagonalizing the matrix $\boldsymbol{\phi}_s$ by a unitary transformation, taking the square roots of the diagonal elements of the transformed matrix (the eigenvalues of $\boldsymbol{\phi}_s$) and performing the inverse unitary transformation. We have dropped the superscripts $(n)$ for simplicity.

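By the Cauchy-Schwarz inequality (2-83) the Hermitian matrices $\mathbf{A} = \boldsymbol{\phi}_s^{1/2} \boldsymbol{\phi}_0^{-1} \boldsymbol{\phi}_s^{1/2}$ and $\mathbf{B} = \boldsymbol{\phi}_s^{1/2} \boldsymbol{\phi}_1^{-1} \boldsymbol{\phi}_s^{1/2}$ satisfy the relation

$$|\operatorname{Tr} \mathbf{A}\mathbf{B}|^2 = \left|\sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij} B_{ji}\right|^2 \le \sum_{i=1}^{n} \sum_{j=1}^{n} |A_{ij}|^2 \sum_{i=1}^{n} \sum_{j=1}^{n} |B_{ji}|^2 = \operatorname{Tr} \mathbf{A}\mathbf{A}^+ \operatorname{Tr} \mathbf{B}\mathbf{B}^+.$$

Hence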

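$$(\Delta U_n)^2 \le \operatorname{Tr}\left(\boldsymbol{\phi}_s^{1/2} \boldsymbol{\phi}_0^{-1} \boldsymbol{\phi}_s \boldsymbol{\phi}_0^{-1} \boldsymbol{\phi}_s^{1/2}\right) \operatorname{Tr}\left(\boldsymbol{\phi}_s^{1/2} \boldsymbol{\phi}_1^{-1} \boldsymbol{\phi}_s \boldsymbol{\phi}_1^{-1} \boldsymbol{\phi}_s^{1/2}\right) = \operatorname{Tr}\left(\boldsymbol{\phi}_s \boldsymbol{\phi}_0^{-1} \boldsymbol{\phi}_s \boldsymbol{\phi}_0^{-1}\right) \operatorname{Tr}\left(\boldsymbol{\phi}_s \boldsymbol{\phi}_1^{-1} \boldsymbol{\phi}_s \boldsymbol{\phi}_1^{-1}\right) = (\operatorname{Var}_1 U_n)(\operatorname{Var}_0 U_n),$$

and the ratio in question is bounded by

$$\frac{(\Delta U_n)^2}{\operatorname{Var}_0 U_n} \le \operatorname{Var}_1 U_n = \operatorname{Tr}\left(\boldsymbol{\phi}_s \boldsymbol{\phi}_0^{-1} \boldsymbol{\phi}_s \boldsymbol{\phi}_0^{-1}\right).$$

If, therefore, $\operatorname{Var}_1 U_n$ stays finite as $n$ goes to infinity, the detection cannot be perfect.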
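The bound can be illustrated numerically for a finite number of samples. The sketch below assumes particular covariance matrices chosen only for illustration, and it uses the standard result that for circular complex Gaussian samples the variance of $U_n = \tfrac{1}{2}\mathbf{V}^+\mathbf{H}\mathbf{V}$ under $H_j$ is $\operatorname{Tr}(\mathbf{H}\boldsymbol{\phi}_j\mathbf{H}\boldsymbol{\phi}_j)$, which agrees with the trace expressions above.

```python
import numpy as np

n = 30
t = np.arange(n) * 0.2
lag = np.abs(t[:, None] - t[None, :])
phi0 = np.eye(n) + 0.5 * np.exp(-3.0 * lag)     # noise covariance (assumed)
phis = 2.0 * np.exp(-1.0 * lag)                  # signal covariance (assumed)
phi1 = phi0 + phis
H = np.linalg.inv(phi0) - np.linalg.inv(phi1)

delta_U = np.trace(H @ phis)                     # E(U|H1) - E(U|H0) = Tr H phi_s
var0 = np.trace(H @ phi0 @ H @ phi0)             # Var_0 U_n
var1 = np.trace(H @ phi1 @ H @ phi1)             # Var_1 U_n

# Eigenvalues e_k of phi0^{-1/2} phi1 phi0^{-1/2}; Var_1 U_n = sum (e_k - 1)^2
w, vecs = np.linalg.eigh(phi0)
phi0_inv_sqrt = (vecs / np.sqrt(w)) @ vecs.T
e = np.linalg.eigvalsh(phi0_inv_sqrt @ phi1 @ phi0_inv_sqrt)
var1_from_eigs = np.sum((e - 1.0) ** 2)
```

The ratio `delta_U**2 / var0` should not exceed `var1`, and the two expressions for the variance under $H_1$ should agree.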

Root [Roo63] studied the singularity of the detection of stochastic signals in terms of the eigenvalues $e_k$ of the matrix $\boldsymbol{\phi}_0^{-1/2} \boldsymbol{\phi}_1 \boldsymbol{\phi}_0^{-1/2}$, by means of which the variance of the test statistic under hypothesis $H_1$ can be written

$$\operatorname{Var}_1 U_n = \operatorname{Tr}\left(\boldsymbol{\phi}_0^{-1/2} \boldsymbol{\phi}_s \boldsymbol{\phi}_0^{-1/2} \boldsymbol{\phi}_0^{-1/2} \boldsymbol{\phi}_s \boldsymbol{\phi}_0^{-1/2}\right) = \sum_{k=1}^{n} (e_k - 1)^2,$$

and he showed that if this sum remains finite as $n$ goes to infinity, the detection process is liable to error. Pitcher [Pit66] has proved that a sufficient condition for this nonsingularity is that the solution $h(t, s)$ of the integral equation (11-36) exist and be continuous in $t$ and $s$. Further treatments can be found in papers by Hajek [Haj62] and Kadota [Kad64], [Kad65]. Slepian [Sle58b] presented some simple and illuminating examples of singular detection.

It is generally difficult to judge on the basis of the autocovariance functions of 
signal and noise whether these conditions are fulfilled. Matters are somewhat simpler 
when both signal and noise are segments of stationary random processes and their 
autocovariance functions depend on $t$ and $s$ only through $\tau = t - s$. Yaglom [Yag63] showed that when both have rational spectral densities $\Phi_s(\omega)$ and $\Phi_0(\omega)$, detection will be imperfect if and only if

$$\int_{-\infty}^{\infty} \left[\frac{\Phi_s(\omega)}{\Phi_0(\omega)}\right]^2 d\omega < \infty. \tag{11-62}$$

This will always be the case when the signal has finite power and the noise contains a component that is white.

We can understand the condition in (11-62) by observing first that the solution of (11-36) will contain a term that is the inverse Fourier transform of $H(\omega)$ in (11-44), along with some delta functions and their derivatives to take care of the end points of the interval $(0, T)$ and, possibly, some exponential functions. When $T$ is large, that term will contribute to $\operatorname{Var}_1 U_n$ approximately

$$T \int_{-\infty}^{\infty} \left[\frac{\Phi_s(\omega)}{\Phi_0(\omega)}\right]^2 \frac{d\omega}{2\pi}.$$

Middleton [Mid61] showed that the remaining terms are always finite. If the condition (11-62) is satisfied, $\operatorname{Var}_1 U_n$ will indeed be finite when the spectral densities are rational functions of $\omega$, and the detection will entail a probability of error that vanishes only when the strength of the signal itself grows beyond all bounds.

11.2 THE PERFORMANCE OF THE RECEIVER 

11.2.1 The Moment-generating Function of the Test Statistic

Both the optimum detector and the threshold detector of a stochastic signal have the same structure. They determine the value of a quadratic functional

$$U = \tfrac{1}{2} \int_0^T \int_0^T V^*(t) h(t, s) V(s)\, dt\, ds \tag{11-63}$$

and compare it with a decision level $U_0$. Here $V(t)$ is the complex envelope of the input to the receiver, and $h(t, s)$ is a kernel whose form depends on which detector is adopted. For the optimum detector, $h(t, s)$ is the solution of the integral equation (11-36); for the threshold detector it is the solution of (11-48).

The probability of detecting a particular signal is the probability that U exceeds 
Uo when that signal is present. Because what signal might be present is unknown 
when the signals are stochastic, the only meaningful way to measure the effectiveness 
of the receiver is to average that probability over all signals of the ensemble. 

It is useful to know the average probability of detection not only for the ensemble of signals for which the detector was designed, but also for ensembles of signals of arbitrary average energy. We therefore introduce the hypothesis $H_\eta$ that a signal is present and that it was drawn from an ensemble in which the autocovariance function is $\eta\phi_s(t, s)$, and we shall attempt to calculate the probability density function $p_\eta(U)$ of the test statistic under that hypothesis. The complex autocovariance function of the process $\operatorname{Re} V(t) \exp i\Omega t$ under $H_\eta$ is

$$\tfrac{1}{2} E[V(t_1) V^*(t_2) \mid H_\eta] = \phi_\eta(t_1, t_2) = \phi_0(t_1, t_2) + \eta\phi_s(t_1, t_2). \tag{11-64}$$

The probability of detection is

$$Q_d(\eta) = \Pr(U > U_0 \mid H_\eta) = \int_{U_0}^{\infty} p_\eta(U)\, dU$$

for signals of an arbitrary average energy

$$E_\eta = \eta \int_0^T \phi_s(t, t)\, dt = \eta E; \tag{11-65}$$

for signals of the designed strength $E$ it is $Q_d = Q_d(1)$. The false-alarm probability is $Q_0 = Q_d(0)$.
The problem of finding the probability density function of a quadratic func- 
tional of a random process was first addressed when Kac and Siegert analyzed the 
filtered output of a quadratic rectifier whose input is Gaussian noise [Kac47]. Let 
the input to the rectifier be $\operatorname{Re} V(t) \exp i\Omega t$. Its filtered output at time $t$ is

$$U(t) = \int_{-\infty}^{t} k(t - r) |V(r)|^2\, dr, \tag{11-66}$$

where $k(\tau)$ is the impulse response of the filter following the rectifier. If we put

$$h(r, s) = k(t - r)\delta(r - s)$$

into (11-63) and change the limits of integration from $0$ and $T$ to $-\infty$ and $t$, we obtain (11-66). Our subsequent formulas can similarly be modified to apply to the quadratic rectifier by changing the limits of integration in this way.

Among other treatments of the distribution of a quadratic functional of Gaussian noise we cite the work of Siegert [Sie57], Slepian [Sle58a], Grenander, Pollak,
and Slepian [Gre59], Turin [Tur60b], and Middleton [Mid60a, Ch. 17]. The usual
procedure is to derive first the characteristic function of U and then, when possible, 
to make a Fourier transformation to obtain the probability density function. This 
is the course we too shall follow, except that we prefer to work with the equivalent 
moment-generating function. As we have seen in Chapter 5, it enables us to calcu- 
late the cumulative distribution of a statistic such as U by Edgeworth's series or by 
saddlepoint integration. 
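As a numerical preview of the finite-sample result derived in this section, namely that the moment-generating function of $U_n$ under $H_\eta$ is $[\det(\mathbf{I}_n + z\boldsymbol{\phi}_\eta\mathbf{H})]^{-1}$, the sketch below evaluates its logarithm in two ways and recovers the mean of $U_n$ by differentiating at $z = 0$. The covariance matrices and the step size of the finite difference are illustrative assumptions.

```python
import numpy as np

n = 20
t = np.arange(n) * 0.2
phi0 = np.eye(n)                                              # white noise (assumed)
phis = 2.0 * np.exp(-1.0 * np.abs(t[:, None] - t[None, :]))   # signal covariance (assumed)
H = np.linalg.inv(phi0) - np.linalg.inv(phi0 + phis)          # optimum-detector matrix (11-32)

def log_mgf(z, eta):
    """ln E[exp(-z U_n) | H_eta] = -ln det(I + z phi_eta H)."""
    phi_eta = phi0 + eta * phis
    sign, logdet = np.linalg.slogdet(np.eye(n) + z * phi_eta @ H)
    return -logdet

def log_mgf_eigs(z, eta):
    """Same quantity from the eigenvalues lambda_k of phi_eta H: -sum ln(1 + z lambda_k)."""
    lam = np.linalg.eigvals((phi0 + eta * phis) @ H).real
    return -np.sum(np.log1p(z * lam))

eta, dz = 1.0, 1e-5
# Mean of U_n as minus the derivative of ln h(z) at z = 0 (central difference)
mean_from_mgf = -(log_mgf(dz, eta) - log_mgf(-dz, eta)) / (2 * dz)
mean_direct = np.trace((phi0 + eta * phis) @ H)               # E(U_n | H_eta) = Tr(phi_eta H)
```

Higher cumulants follow in the same way from higher derivatives of the logarithm at $z = 0$.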

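We replace the statistic in (11-63) with its matrix form

$$U = \tfrac{1}{2} \mathbf{V}^+ \mathbf{H} \mathbf{V}, \tag{11-67}$$

where as in Sec. 11.1.2 we represent the complex envelope $V(t)$ by a column vector
$\mathbf{V}$ of its samples and $V^*(t)$ by a row vector $\mathbf{V}^+$ of the complex conjugates of those
samples. We begin with a restriction to a finite number $n$ of samples, whose joint
probability density function under hypothesis $H_\eta$ is

$$p_\eta(\mathbf{V}^{(n)}) = (2\pi)^{-n}\, [\det \boldsymbol{\phi}_\eta^{(n)}]^{-1} \exp\!\left( -\tfrac{1}{2} \mathbf{V}^{(n)+} [\boldsymbol{\phi}_\eta^{(n)}]^{-1} \mathbf{V}^{(n)} \right),$$

where $\boldsymbol{\phi}_\eta^{(n)}$ is the $n \times n$ covariance matrix of the samples $\mathbf{V}^{(n)}$ under hypothesis $H_\eta$.

Denoting by $U_n$ the restriction of (11-67) to the same $n$ samples $\mathbf{V}^{(n)}$, with $\mathbf{H}^{(n)}$
the corresponding $n \times n$ block of the matrix $\mathbf{H}$, we express its moment-generating
function under hypothesis $H_\eta$ as

$$E[\exp(-z U_n) \mid H_\eta] = (2\pi)^{-n} [\det \boldsymbol{\phi}_\eta^{(n)}]^{-1} \int \!\cdots\! \int \exp\!\left[ -\tfrac{1}{2} \mathbf{V}^{(n)+} [\boldsymbol{\phi}_\eta^{(n)}]^{-1} \mathbf{V}^{(n)} - \tfrac{1}{2} z\, \mathbf{V}^{(n)+} \mathbf{H}^{(n)} \mathbf{V}^{(n)} \right] d^n V_x\, d^n V_y,$$

where $d^n V_x\, d^n V_y$ is the volume element in the space of the real and imaginary parts
of the $n$ samples $\mathbf{V}^{(n)}$. By (B-10) this integral equals

$$E[\exp(-z U_n) \mid H_\eta] = \left[ \det \boldsymbol{\phi}_\eta^{(n)}\, \det\!\left( [\boldsymbol{\phi}_\eta^{(n)}]^{-1} + z \mathbf{H}^{(n)} \right) \right]^{-1} = [\det(\mathbf{I}_n + z \boldsymbol{\phi}_\eta^{(n)} \mathbf{H}^{(n)})]^{-1},$$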

where $\mathbf{I}_n$ is the $n \times n$ identity matrix. Passing now to the limit $n \to \infty$, we find for
the moment-generating function of the quadratic functional $U = \tfrac{1}{2} \mathbf{V}^+ \mathbf{H} \mathbf{V}$

$$h_\eta(z) = E[e^{-Uz} \mid H_\eta] = [\det(\mathbf{I} + z \boldsymbol{\phi}_\eta \mathbf{H})]^{-1} = \exp[-\mathrm{Tr} \ln(\mathbf{I} + z \boldsymbol{\phi}_\eta \mathbf{H})] \tag{11-68}$$

as in (11-20). By the same kind of analysis as in Sec. 11.1.2, we can express this
moment-generating function as

$$h_\eta(z) = \exp\!\left[ -\int_0^z du \int_0^T L_\eta(t, t; u)\, dt \right], \tag{11-69}$$

where the function $L_\eta(t, s; u)$ corresponds to the matrix

$$\mathbf{L}_u = (\mathbf{I} + u \boldsymbol{\phi}_\eta \mathbf{H})^{-1} \boldsymbol{\phi}_\eta \mathbf{H}$$

and is the solution of the integral equation

$$\int_0^T \phi_\eta(t, r)\, h(r, s)\, dr = L_\eta(t, s; u) + u \int_0^T \!\! \int_0^T \phi_\eta(t, r)\, h(r, v)\, L_\eta(v, s; u)\, dr\, dv, \qquad 0 < (t, s) < T, \tag{11-70}$$

as in (11-24). It would be necessary to solve this integral equation for all values of $u$
in order, through (11-69), to determine the moment-generating function $h_\eta(z)$ of $U$
for all values of $z$, and this must then be inverted to obtain the probability density
function

$$p_\eta(U) = \int_{c - i\infty}^{c + i\infty} h_\eta(z)\, e^{Uz}\, \frac{dz}{2\pi i}, \tag{11-71}$$

where the real quantity $c$ lies to the right of all singularities of $h_\eta(z)$. To carry out
this program would be a formidable task.
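Although the general program is formidable, the finite-sample determinant formula that leads to (11-68) is easy to test numerically. The sketch below is only an illustration under stated assumptions: the matrices are small random Hermitian test matrices (not taken from the text), the helper names are hypothetical, and a seeded Monte Carlo average of $\exp(-zU)$ is compared with $[\det(\mathbf{I} + z\boldsymbol{\phi}\mathbf{H})]^{-1}$.

```python
import numpy as np

def mgf_formula(z, phi, H):
    """[det(I + z*phi*H)]^(-1), the finite-n expression that leads to (11-68)."""
    n = phi.shape[0]
    return 1.0 / np.linalg.det(np.eye(n) + z * (phi @ H)).real

def sample_envelopes(phi, count, rng):
    """Rows are circular complex Gaussian vectors V with (1/2) E[V V^+] = phi."""
    n = phi.shape[0]
    chol = np.linalg.cholesky(phi)            # phi = chol @ chol^+
    # Real and imaginary parts have unit variance, so (1/2) E[w w^+] = I.
    w = rng.standard_normal((count, n)) + 1j * rng.standard_normal((count, n))
    return w @ chol.T                         # each row equals chol @ w_row

def mgf_monte_carlo(z, phi, H, count=200_000, seed=0):
    """Sample average of exp(-z*U) with U = (1/2) V^+ H V."""
    rng = np.random.default_rng(seed)
    V = sample_envelopes(phi, count, rng)
    U = 0.5 * np.einsum("si,ij,sj->s", V.conj(), H, V).real
    return float(np.mean(np.exp(-z * U)))

def random_hermitian_pd(n, rng, scale=1.0):
    """Hypothetical helper: a positive-definite Hermitian test matrix."""
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return scale * (a @ a.conj().T / n + np.eye(n))

rng = np.random.default_rng(1)
phi = random_hermitian_pd(3, rng)             # stands in for phi_eta^(n)
H = random_hermitian_pd(3, rng, scale=0.3)    # stands in for H^(n)
print(mgf_formula(0.7, phi, H), mgf_monte_carlo(0.7, phi, H))
```

The two printed numbers should agree to within the Monte Carlo sampling error; the determinant formula itself involves no sampling.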



11.2.2 Detectability in White Noise: The Residue Series



When the noise is white and the quadratic functional $U$ is the optimum statistic for
detecting the Gaussian stochastic signal

$$s(t) = \mathrm{Re}\, S(t)\, e^{i\Omega t}$$

in such noise, the problem of calculating the moment-generating function simplifies
to the point where it is feasible to use $h_\eta(z)$ to compute false-alarm and detection
probabilities. With white noise the matrix $\boldsymbol{\phi}_0 = N\mathbf{I}$ is diagonal for any set of
orthonormal sampling functions $f_k(t)$. The matrix $\mathbf{H}$ figuring in the optimum detection
statistic is now, by (11-33),

$$\mathbf{H} = N^{-1} (N\mathbf{I} + \boldsymbol{\phi}_s)^{-1} \boldsymbol{\phi}_s, \tag{11-72}$$

and with

$$\boldsymbol{\phi}_\eta = N\mathbf{I} + \eta \boldsymbol{\phi}_s,$$

by (11-64), we find for the moment-generating function in (11-68)

$$h_\eta(z) = \left\{ \det[\mathbf{I} + z N^{-1} (N\mathbf{I} + \eta \boldsymbol{\phi}_s)(N\mathbf{I} + \boldsymbol{\phi}_s)^{-1} \boldsymbol{\phi}_s] \right\}^{-1} = \frac{\det(\mathbf{I} + N^{-1} \boldsymbol{\phi}_s)}{\det[\mathbf{I} + (1 + z) N^{-1} \boldsymbol{\phi}_s + z \eta N^{-2} \boldsymbol{\phi}_s^2]}. \tag{11-73}$$

We now seek a residue expansion of this function in order to evaluate the probability
density function in (11-71).

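Let $\lambda_k$ be the $k$th eigenvalue of the matrix $N^{-1} \boldsymbol{\phi}_s$, that is, of the integral
equation

$$\lambda_k f_k(t) = N^{-1} \int_0^T \phi_s(t, u)\, f_k(u)\, du, \qquad 0 \le t \le T. \tag{11-74}$$

Then the denominator of (11-73) can be written as

$$G(z) = \prod_k [1 + (1 + z) \lambda_k + z \eta \lambda_k^2],$$

and the poles of the moment-generating function $h_\eta(z)$ lie at the points where the
factors of $G(z)$ vanish, that is, at

$$z = z_k = -\frac{1 + \lambda_k}{\lambda_k (1 + \eta \lambda_k)}. \tag{11-75}$$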
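The eigenvalue form of (11-73) and the pole locations (11-75) can be checked against the matrix form on a small example. This is a sketch only: `phi_s` is a random symmetric test matrix standing in for the sampled signal covariance, and the function names are hypothetical.

```python
import numpy as np

def mgf_from_eigenvalues(z, lam, eta):
    """h_eta(z) of (11-73) as a product over the eigenvalues lam of N^-1 phi_s."""
    lam = np.asarray(lam, dtype=float)
    return float(np.prod((1 + lam) / (1 + (1 + z) * lam + z * eta * lam**2)))

def mgf_from_matrices(z, phi_s, N, eta):
    """1/det(I + z*phi_eta*H), with H from (11-72) and phi_eta = N*I + eta*phi_s."""
    I = np.eye(phi_s.shape[0])
    H = np.linalg.solve(N * I + phi_s, phi_s) / N
    phi_eta = N * I + eta * phi_s
    return float(1.0 / np.linalg.det(I + z * (phi_eta @ H)))

def poles(lam, eta):
    """Pole locations z_k of (11-75); all lie on the negative real axis."""
    lam = np.asarray(lam, dtype=float)
    return -(1 + lam) / (lam * (1 + eta * lam))

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 4))
phi_s = a @ a.T                      # hypothetical signal covariance (real symmetric)
N, eta, z = 2.0, 0.6, 0.8
lam = np.linalg.eigvalsh(phi_s) / N  # eigenvalues of N^-1 phi_s
print(mgf_from_eigenvalues(z, lam, eta), mgf_from_matrices(z, phi_s, N, eta))
print(poles(lam, eta))
```

For $\eta = 1$ the poles reduce to $z_k = -1/\lambda_k$, and for $\eta = 0$ to $z_k = -(1 + \lambda_k)/\lambda_k$.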

In the neighborhood of its $k$th zero $z_k$, the denominator of (11-73) can be written,
with some algebra, as

$$G(z) \approx \lambda_k (1 + \eta \lambda_k) \prod_{m \ne k} [1 + \lambda_m + z_k \lambda_m (1 + \eta \lambda_m)]\, (z - z_k).$$

As in Sec. 11.1.2 we introduce the Fredholm determinant

$$D(z) = \prod_{k=1}^{\infty} (1 + \lambda_k z) = \det(\mathbf{I} + z N^{-1} \boldsymbol{\phi}_s), \tag{11-76}$$

in terms of which $G(z)$ can be written

w &k -7-7— -r(2 - ^) = - 2*), (11-77) 

where 
and 
the prime indicating differentiation with respect to the argument of the function.
The numerator of (11-73) equals $D(1)$. Thus (11-73) can be written



If we now substitute this into the contour integral (11-71) and complete the
contour around the left half-plane, we find for the probability density function of
the statistic $U$ under hypothesis $H_\eta$

R k +2^ +-n\it)z>(i) 



Pv(U) = > — W1 , - Y ,, n , x exp U~ k , 



when we apply the residue theorem and use (11-77). The probability $Q_d(\eta)$ of
detecting a stochastic signal of relative average strength $\eta$ is then

with $z_k$ given by (11-75), $a_k$ by (11-78), and $R_k$ by (11-79).
The false-alarm probability is obtained by setting $\eta = 0$,

and the probability of detecting a signal of design strength ($\eta = 1$) is

Qus = QM) = X exp (~^f} (U " 82) 

These residue series can be evaluated once the eigenvalues $\lambda_k$ of (11-74) have been
computed, provided that not too many of them have a significant magnitude. As
we shall see, when the product of the signal bandwidth $W$ and the observation
time $T$ is large, the number of significant eigenvalues is on the order of $WT$, the
eigenvalues lie close together, and the factors $R_k$, which alternate in sign, become
difficult to calculate with sufficient accuracy. These series then become inconvenient
for computation and unreliable, and another method must be sought.
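The numerical difficulty can be seen in the simplest case, $\eta = 1$, where (11-73) reduces to $h_1(z) = 1/D(z) = \prod_k (1 + \lambda_k z)^{-1}$. This is the moment-generating function of a sum of independent exponential variables with means $\lambda_k$, whose tail probability is the standard residue sum $\sum_k c_k \exp(-U_0/\lambda_k)$ with $c_k = \prod_{m \ne k} (1 - \lambda_m/\lambda_k)^{-1}$. The sketch below uses that textbook form for a finite, distinct set of eigenvalues; it is not a transcription of the general series in terms of $a_k$ and $R_k$, and the function names are hypothetical.

```python
import numpy as np

def residue_coefficients(lam):
    """c_k = prod_{m != k} 1 / (1 - lam_m / lam_k) for distinct eigenvalues lam."""
    lam = np.asarray(lam, dtype=float)
    return np.array([1.0 / np.prod(1.0 - np.delete(lam, k) / lam[k])
                     for k in range(lam.size)])

def q_design_residue(U0, lam):
    """Pr(U > U0) when the mgf is prod_k (1 + lam_k z)^-1 (the case eta = 1)."""
    lam = np.asarray(lam, dtype=float)
    return float(np.sum(residue_coefficients(lam) * np.exp(-U0 / lam)))

# Well-separated eigenvalues: modest coefficients, harmless cancellation.
print(residue_coefficients([2.0, 1.0]), q_design_residue(1.0, [2.0, 1.0]))

# Closely spaced eigenvalues: huge coefficients of alternating sign.
clustered = 1.0 + 0.02 * np.arange(8)
print(np.abs(residue_coefficients(clustered)).max())
```

With eight eigenvalues spaced by 0.02 the coefficients already reach magnitudes far above $10^6$, so a probability of order one is formed as a difference of enormous terms. That is the loss of accuracy described in the paragraph above.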

When the product of the observation time $T$ by the bandwidth of the stochastic
signal, assumed stationary, is much less than 1, only a single eigenvalue $\lambda_1$ is
significant. The associated eigenfunction $f_1(t)$ is then approximately constant:

$$f_1(t) \approx T^{-1/2}, \qquad 0 < t < T.$$

From (11-39) with $m(t, s) = N^{-1} \phi_s(t, s)$ we see that

$$\lambda_1 \approx \frac{E}{N} = S_0,$$

where as in (11-65) $E = E_1$ is the average energy of the signal for which the detector
is designed; $S_0$ is the design energy-to-noise ratio. Now both the optimum and the
threshold detectors effectively base their decisions on the squared magnitude $|V_1|^2$
of a single sample of the complex envelope of the input; $|V_1|^2$ has an exponential
distribution under both hypotheses. The false-alarm and detection probabilities are
given by (11-80) and (11-81) as



so that 



e °" exp l — } 



$$Q_d(\eta) = Q_0^{1/(1 + \eta S_0)}, \qquad S_0 = \frac{E}{N}. \tag{11-83}$$
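Reading (11-83) as $Q_d(\eta) = Q_0^{1/(1 + \eta S_0)}$, the single-eigenvalue case needs no eigenvalue computation at all. The helper names below are hypothetical; the second function simply inverts the relation at $\eta = 1$ to give the design energy-to-noise ratio needed for a target $Q_{ds}$.

```python
import math

def detection_probability(q0, s0, eta=1.0):
    """Q_d(eta) = Q_0 ** (1 / (1 + eta * S_0)), the single-eigenvalue result (11-83)."""
    return q0 ** (1.0 / (1.0 + eta * s0))

def design_snr_for(q0, q_ds):
    """Design ratio S_0 = E/N giving Q_ds at eta = 1: S_0 = ln(Q_0)/ln(Q_ds) - 1."""
    return math.log(q0) / math.log(q_ds) - 1.0

s0 = design_snr_for(1e-6, 0.9)
print(s0, 10 * math.log10(s0), detection_probability(1e-6, s0))
```

The slow approach of $Q_d$ to 1 as $S_0$ grows reflects the exponential distribution of the single sample: a detection probability near 1 requires $S_0$ of roughly $\ln Q_0 / \ln Q_{ds}$.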
11.2.3 The Threshold Detector in White Noise 

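The threshold detection statistic, according to (11-48) and (11-49), is

$$U_\theta = \tfrac{1}{2} \mathbf{V}^+ \boldsymbol{\phi}_0^{-1} \boldsymbol{\phi}_s \boldsymbol{\phi}_0^{-1} \mathbf{V}$$

in the matrix representation, and when the noise is white, $\boldsymbol{\phi}_0 = N\mathbf{I}$, this becomes

$$U_\theta = \frac{\mathbf{V}^+ \boldsymbol{\phi}_s \mathbf{V}}{2N^2} = \frac{1}{2N^2} \int_0^T \!\! \int_0^T V^*(t)\, \phi_s(t, u)\, V(u)\, dt\, du. \tag{11-84}$$

This statistic can be realized by the methods described in Sec. 11.1.4, with $h(t, s)$ as
used there replaced by $\phi_s(t, s)/N^2$.

Comparison with (11-67) shows that the matrix $\mathbf{H}$ there is now

$$\mathbf{H} = N^{-2} \boldsymbol{\phi}_s,$$

and by (11-68) the moment-generating function of the threshold statistic is

$$h_\eta^\theta(z) = [\det(\mathbf{I} + z N^{-2} \boldsymbol{\phi}_\eta \boldsymbol{\phi}_s)]^{-1} = \prod_k [1 + z \lambda_k (1 + \eta \lambda_k)]^{-1}$$

under hypothesis $H_\eta$ that a signal with complex autocovariance function $\eta \phi_s(t, s)$
is present.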

By the same kind of analysis as in Sec. 11.2.2, one finds that the probability 
of detection attained by the threshold detector is 

fiSOn) = MU 6 > Ui\ ff„) 

k (l+^Wai) 6 *^ (U-85) 

J - 1 / _ ^ 

z * - "T-TTT— a k = 



Ml+^)' * 

The false-alarm probability, which determines the decision level $U_0$, is then

The performance of the threshold detector is independent of the input energy-to-noise
ratio adopted in specifying it by (11-84). The same difficulties attend the
computation of its false-alarm and detection probabilities as were noted at the end
of Sec. 11.2.2.

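The optimum statistic $U$ for detection in white Gaussian noise can be written
as

$$U = \frac{1}{2N} \sum_k \frac{\lambda_k}{1 + \lambda_k}\, |V_k|^2 \tag{11-87}$$

and the threshold statistic $U_\theta$ as

$$U_\theta = \frac{1}{2N} \sum_k \lambda_k\, |V_k|^2, \tag{11-88}$$

where $V_k$ is the sample of the complex envelope $V(t)$ associated with the eigenfunction
$f_k(t)$ of (11-74). We expect the threshold statistic to be a good approximation to the
optimum one, therefore, when all the eigenvalues $\lambda_k$ are much less than 1. A crude
criterion for this can be derived by the following reasoning, in which it is assumed that
the signal is stationary and has an autocovariance function $\phi_s(t - s)$.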
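From (11-19) applied to $m(t, s) = N^{-1} \phi_s(t - s)$, we find

$$\sum_k \lambda_k = \frac{1}{N} \int_0^T \phi_s(0)\, dt = \frac{T \phi_s(0)}{N} = \frac{E}{N}, \tag{11-89}$$

where $E$ is the average energy in the signal. Similarly,

$$\sum_k \lambda_k^2 = \frac{1}{N^2} \int_0^T \phi_s^{(2)}(t, t)\, dt = \frac{1}{N^2} \int_0^T \!\! \int_0^T |\phi_s(t - u)|^2\, dt\, du,$$

where $\phi_s^{(2)}(t, u)$ is the iterated kernel as in (11-17). This can be written

$$\sum_k \lambda_k^2 = \frac{1}{N^2} \int_{-T}^{T} (T - |u|)\, |\phi_s(u)|^2\, du. \tag{11-90}$$

If now about $M$ of the eigenvalues are nearly equal, $\lambda_k \approx \lambda$, and the rest are
negligible,

$$M\lambda \approx \frac{E}{N}, \qquad M\lambda^2 \approx \frac{1}{N^2} \int_{-T}^{T} (T - |u|)\, |\phi_s(u)|^2\, du;$$

and we find for the approximate number of significant eigenvalues

$$M \approx \frac{T^2 |\phi_s(0)|^2}{\int_{-T}^{T} (T - |u|)\, |\phi_s(u)|^2\, du}, \tag{11-91}$$

and

$$\lambda \approx \frac{E}{NM}. \tag{11-92}$$

When the width of the autocovariance function $\phi_s(\tau)$ is much greater than $T$, we
find $M \approx 1$ and $\lambda \approx \lambda_1 \approx E/N$ as before. When, on the other hand, its width is
much less than $T$,

$$M \approx \frac{T |\phi_s(0)|^2}{\int_{-\infty}^{\infty} |\phi_s(\tau)|^2\, d\tau} = WT,$$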






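where $W$ is a signal bandwidth defined by

$$W = \frac{|\phi_s(0)|^2}{\int_{-\infty}^{\infty} |\phi_s(\tau)|^2\, d\tau}. \tag{11-93}$$

The threshold approximation can be expected to be valid, therefore, when $\lambda =
E/(NWT) \ll 1$.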
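The estimates (11-91) and (11-93) require only the autocovariance function, so they are easy to evaluate numerically. The sketch below assumes the exponential autocovariance $\phi_s(\tau) = P_s e^{-\mu|\tau|}$ of the Lorentz spectrum as a test case; the function names and the hand-written trapezoidal rule are illustrative choices, not part of the text.

```python
import numpy as np

def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid version-specific NumPy names."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def effective_eigenvalue_count(phi_s, T, num=200_001):
    """M of (11-91): T^2 |phi_s(0)|^2 / integral of (T - |u|) |phi_s(u)|^2 over (-T, T)."""
    u = np.linspace(-T, T, num)
    denom = trapezoid((T - np.abs(u)) * np.abs(phi_s(u)) ** 2, u)
    return T**2 * abs(phi_s(0.0)) ** 2 / denom

def effective_bandwidth(phi_s, tau_max, num=200_001):
    """W of (11-93), with the infinite range truncated to (-tau_max, tau_max)."""
    tau = np.linspace(-tau_max, tau_max, num)
    return abs(phi_s(0.0)) ** 2 / trapezoid(np.abs(phi_s(tau)) ** 2, tau)

def lorentz_autocovariance(mu, Ps=1.0):
    """Assumed test case: phi_s(tau) = Ps * exp(-mu |tau|)."""
    return lambda tau: Ps * np.exp(-mu * np.abs(tau))

phi = lorentz_autocovariance(mu=2.0)
print(effective_bandwidth(phi, tau_max=20.0))      # close to mu
print(effective_eigenvalue_count(phi, T=25.0))     # close to mu * T for mu * T >> 1
print(effective_eigenvalue_count(lorentz_autocovariance(0.01), T=1.0))  # close to 1
```

For this autocovariance the integral in (11-91) has the closed form $2P_s^2[T/(2\mu) - (1 - e^{-2\mu T})/(4\mu^2)]$, which the numerical values can be checked against.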

By using the definition in (4-66), the reader can easily show that the deflection 
of the threshold statistic is 

Thus when $WT \gg 1$, the threshold detector may attain a respectable probability of
detection while nevertheless the criterion $E/(NWT) \ll 1$ for its validity is satisfied.

11.2.4 Application of Saddlepoint Integration 

When the product $WT$ of the signal bandwidth $W$ and the observation time $T$ is
large, the probability of detection by either the optimum or the threshold statistic
is most accurately computed by the method of saddlepoint integration described in
Sec. 5.2. According to (5-19) and (5-20),

$$1 - Q_d(\eta) = \int_{C_-} z^{-1} h_\eta(z) \exp(U_0 z)\, \frac{dz}{2\pi i}, \tag{11-94}$$

$$Q_d(\eta) = -\int_{C_+} z^{-1} h_\eta(z) \exp(U_0 z)\, \frac{dz}{2\pi i}, \tag{11-95}$$

where $C_-$ is a path that passes through the saddlepoint $z_0^-$ of the integrand lying to
the right of the origin and $C_+$ is a path through the saddlepoint $z_0^+$ to the left of
the origin. One uses (11-94) for detection probabilities greater than about 0.5 and
(11-95) for those less than about 0.5.

In order to put the moment-generating function $h_\eta(z)$ from (11-73) into a form
in which these integrals can be evaluated numerically, we introduce the functions

$$\alpha(z) = \tfrac{1}{2} \{ 1 + z + [(1 + z)^2 - 4\eta z]^{1/2} \},$$
$$\beta(z) = \tfrac{1}{2} \{ 1 + z - [(1 + z)^2 - 4\eta z]^{1/2} \}.$$

For these

$$\alpha(z) + \beta(z) = 1 + z, \qquad \alpha(z)\beta(z) = \eta z,$$

and from (11-73) and (11-76) we find

$$h_\eta(z) = D(1) \prod_k [1 + (1 + z)\lambda_k + z\eta\lambda_k^2]^{-1} = \frac{D(1)}{D(\alpha(z))\, D(\beta(z))}. \tag{11-96}$$

If the Fredholm determinant is known in analytic form, therefore, we can use (11-96)
in (11-94) and (11-95) and evaluate the integrals numerically as described in Sec. 5.2.
It has been found efficient to integrate along a parabolic path as in (5-36) [Hel83].
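A minimal numerical sketch of this procedure follows. It makes several simplifying assumptions that are not the method of the text: the Fredholm determinant is the truncated product (11-76) over a supplied list of eigenvalues, the contour for (11-94) is a straight vertical line $\mathrm{Re}\, z = c > 0$ rather than a parabola through the saddlepoint, and the integral is truncated at $|\mathrm{Im}\, z| \le$ `y_max`. All function and parameter names are hypothetical.

```python
import numpy as np

def fredholm_det(z, lam):
    """Truncated product (11-76): D(z) = prod_k (1 + lam_k z); z may be an array."""
    z = np.asarray(z, dtype=complex)
    return np.prod(1.0 + np.multiply.outer(z, np.asarray(lam, dtype=float)), axis=-1)

def mgf_optimum(z, lam, eta):
    """h_eta(z) = D(1) / [D(alpha(z)) D(beta(z))], as in (11-96)."""
    z = np.asarray(z, dtype=complex)
    root = np.sqrt((1.0 + z) ** 2 - 4.0 * eta * z)   # branch is immaterial: symmetric in alpha, beta
    alpha = 0.5 * (1.0 + z + root)
    beta = 0.5 * (1.0 + z - root)
    return fredholm_det(1.0, lam) / (fredholm_det(alpha, lam) * fredholm_det(beta, lam))

def miss_probability(U0, lam, eta, c=0.5, y_max=400.0, dy=0.01):
    """1 - Q_d(eta) from (11-94), integrated along the vertical line Re z = c > 0."""
    y = np.arange(-y_max, y_max + dy, dy)
    z = c + 1j * y
    integrand = mgf_optimum(z, lam, eta) * np.exp(U0 * z) / z
    # dz = i dy, so dz / (2 pi i) = dy / (2 pi); the imaginary parts cancel by symmetry.
    return float(np.sum(integrand.real) * dy / (2.0 * np.pi))

lam = [0.5, 0.5, 0.5, 0.5]            # four equal eigenvalues as a test case
print(1.0 - miss_probability(3.0, lam, eta=1.0))   # detection probability at design strength
print(1.0 - miss_probability(3.0, lam, eta=0.0))   # false-alarm probability
```

With equal eigenvalues the statistic is a scaled gamma variable, so the two printed values can be compared with the finite series $e^{-x} \sum_{k<m} x^k/k!$ at $x = U_0/\lambda$ and $x = U_0(1 + \lambda)/\lambda$. A straight line is adequate here because the integrand decays like $|z|^{-(m+1)}$; for small tail probabilities a path through the saddlepoint is the better choice.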
For the threshold detector one similarly defines

$$\alpha'(z) = \tfrac{1}{2} [z + (z^2 - 4\eta z)^{1/2}], \tag{11-97}$$
$$\beta'(z) = \tfrac{1}{2} [z - (z^2 - 4\eta z)^{1/2}], \tag{11-98}$$

whereupon the moment-generating function $h_\eta^\theta(z)$ of the threshold statistic $U_\theta$ is, by
(11-84),

$$h_\eta^\theta(z) = \frac{1}{D(\alpha'(z))\, D(\beta'(z))}. \tag{11-99}$$

When the eigenvalues $\lambda_k$ of (11-74) are known, one can approximate the Fredholm
determinant of (11-76) by taking a finite number of factors and stopping the
multiplication when the eigenvalue $\lambda_k$ is sufficiently small. Otherwise one must seek
an alternative method of computing the Fredholm determinant appearing in (11-96)
and (11-99) at complex values of $z$ along the path of integration.

If we identify $N^{-1} \phi_s(t, s)$ with the function $m(t, s)$ that figures in Sec. 11.1.2,
the eigenvalues $\mu_k$ appearing there are identical with the eigenvalues $\lambda_k$ in (11-74),
and the function $P(t, s; u)$ of (11-24) becomes

$$P(t, s; u) = N h(t, s; u)$$

in terms of the solution of (11-53), whereupon by (11-26) the Fredholm determinant
is given by

$$D(z) = \exp\!\left[ N \int_0^z du \int_0^T h(t, t; u)\, dt \right]. \tag{11-100}$$

If one can solve (11-53) analytically, therefore, this equation permits calculating the
Fredholm determinant and thence the moment-generating function of the detection
statistic. A simpler formula will be derived in Sec. 11.3.

11.2.5 The Toeplitz Approximation

When the stochastic signal is a segment of a stationary process observed in the
presence of white noise during an interval $(0, T)$, (11-53) becomes

$$N h(r, t; u) + u \int_0^T \phi_s(r - s)\, h(s, t; u)\, ds = \frac{1}{N} \phi_s(r - t), \qquad 0 < (r, t) < T, \tag{11-101}$$

where $\phi_s(\tau)$ is the complex autocovariance function of the signal. If the product
$WT$ of the observation time $T$ and a suitable measure $W$ of the bandwidth of the
signal is large, we can approximate (11-101) by extending the limits of integration
to $(-\infty, \infty)$, whereupon it is apparent that the function $h(r, t; u)$ depends on $r$ and
$t$ only through their difference:

$$h(r, t; u) \approx h_\infty(r - t; u).$$

We can then solve (11-101) approximately by Fourier transformation, much as in
(2-77) through (2-79), and setting

$$h_\infty(r - t; u) = \int_{-\infty}^{\infty} H_\infty(\omega; u)\, e^{i\omega(r - t)}\, \frac{d\omega}{2\pi},$$

we find

$$N H_\infty(\omega; u) + u\, \Phi_s(\omega)\, H_\infty(\omega; u) = N^{-1} \Phi_s(\omega)$$

in terms of the spectral density $\Phi_s(\omega)$ of the complex envelope of the signal, that is,
the Fourier transform of $\phi_s(\tau)$. Hence

$$H_\infty(\omega; u) = \frac{\Phi_s(\omega)}{N[N + u \Phi_s(\omega)]},$$

and by (11-100) the Fredholm determinant is approximately

$$D(z) \approx D_\infty(z) = \exp\!\left[ N T \int_0^z h_\infty(0; u)\, du \right] = \exp\!\left[ T \int_0^z du \int_{-\infty}^{\infty} \frac{\Phi_s(\omega)}{N + u \Phi_s(\omega)}\, \frac{d\omega}{2\pi} \right] = \exp\!\left[ T \int_{-\infty}^{\infty} \ln[1 + z N^{-1} \Phi_s(\omega)]\, \frac{d\omega}{2\pi} \right]. \tag{11-102}$$

If in this limit $WT \gg 1$ one samples all temporal functions at instants uniformly
spaced over $(0, T)$, the resulting matrices $\boldsymbol{\phi}_\eta$ and $\mathbf{H}$ become Toeplitz matrices
in the sense defined in connection with (10-27); their elements depend only on the
differences of their indices, and each row is identical to the row above it, but shifted
one place to the right. For this reason (11-102) is called the Toeplitz approximation
[Gre59].

11.2.6 Rational Spectral Densities 

The complex autocovariance function of the stationary signal is subject to the
Hermitian condition $\phi_s^*(\tau) = \phi_s(-\tau)$. If its spectral density is a real rational function of
the frequency $\omega$, we can apply the methods of Appendix A to evaluating the Toeplitz
approximation. If $\Phi_s(\omega)$ has $2n$ simple poles, $\phi_s(\tau)$ must have the form (A-3),

$$\phi_s(\tau) = \begin{cases} \displaystyle\sum_{k=1}^{n} f_k \exp(-\mu_k^* \tau), & \tau > 0, \\[2mm] \displaystyle\sum_{k=1}^{n} f_k^* \exp(\mu_k \tau), & \tau < 0, \end{cases} \tag{11-103}$$

for $n$ complex constants $\mu_k$. The term $f_0 \delta(\tau)$ in (A-3) is missing because the signal
must have finite average power.

The associated spectral density is as given in (A-4), 



__ v fkx^kx + /*;•(&> ~ M* r ) 
for real values of $\omega$, and when this is cleared of fractions, it becomes
in n 

$,.(co) - c f] i/u - iy? n - n*r 2 

/=! k=\ 

_ r n;L,[( W -M 2 + 4i . _.. + .. 

- C — = — . h; = /l,v + lh; r . 

nil r/ 1? , 2 1 ' .' ' A J> ' 



(11-104) 



with $C$ a positive constant, and with $m < n$ in order that the signal process have
finite average power. As in Appendix A we take the $\mu_k$, $1 \le k \le n$, to have positive
real parts. This spectral density has simple poles at the $n$ points $-i\mu_k$ in the lower
half of the $\omega$-plane and at the $n$ points $i\mu_k^*$ in the upper half.

The argument of the logarithm in (11-102) is then also a rational function of
frequency and can be written

$$1 + z N^{-1} \Phi_s(\omega) = \prod_{j=1}^{2n} \frac{i\omega - \beta_j(z)}{i\omega - \mu_j}, \tag{11-105}$$

where the $\beta_j(z)$ are the $2n$ roots of the algebraic equation obtained by clearing the
equation

$$1 + z N^{-1} \Phi_s(-ip) = 0 \tag{11-106}$$

of fractions; we have put $\mu_{k+n} = -\mu_k^*$, $1 \le k \le n$.

The function in (11-105) can vanish for real values of $\omega$ only for $z$ real and
negative. As the point $z$ moves from the origin along any path that does not cross
the negative real axis in the $z$-plane, therefore, the trajectories of the roots $\beta_j(z)$
cannot cross the imaginary axis. Provided that $z$ does not lie on the negative real
axis, $n$ of the $\beta_j$ will have positive real parts and $n$ will have negative real parts. We
index the roots so that $\beta_1, \ldots, \beta_n, \mu_1, \ldots, \mu_n$ have positive real parts and so that
$\beta_{n+1}, \ldots, \beta_{2n}, \mu_{n+1}, \ldots, \mu_{2n}$ have negative real parts.

With this indexing convention, as can be shown from [Gra65, eq. 4.222(1),
p. 525], the result of substituting (11-105) into (11-102) and integrating is the Toeplitz
approximation

$$D(z) \approx D_\infty(z) = \exp\!\left[ T \sum_{j=1}^{n} (\beta_j - \mu_j) \right]. \tag{11-107}$$

An example of this result will appear in Appendix H.
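As a brief illustration of (11-107), consider the simplest rational spectrum, $n = 1$. The worked lines below assume the Lorentz form $\Phi_s(\omega) = 2\mu P_s/(\omega^2 + \mu^2)$ with real $\mu > 0$, which corresponds to $\phi_s(\tau) = P_s e^{-\mu|\tau|}$; the symbol $\beta$ is introduced here for the root with positive real part. This is a sketch, not a substitute for the treatment in Appendix H.

```latex
% Substitute omega = -ip into 1 + z N^{-1} Phi_s(omega):
1 + zN^{-1}\Phi_s(-ip)
  = 1 + \frac{2\mu P_s z}{N(\mu^2 - p^2)}
  = \frac{\beta^2 - p^2}{\mu^2 - p^2},
\qquad
\beta^2 = \mu^2 + \frac{2\mu P_s z}{N}.

% Roots: p = +beta and p = -beta.  With Re(beta) > 0 we index
% beta_1 = beta, beta_2 = -beta, mu_1 = mu, mu_2 = -mu, so (11-107) gives
D_\infty(z) = \exp[(\beta - \mu)T].
```

The same value follows directly from (11-102), since $\int_{-\infty}^{\infty} \ln[(\omega^2 + \beta^2)/(\omega^2 + \mu^2)]\, d\omega/2\pi = \beta - \mu$.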

One uses $D_\infty(z)$ to determine the moment-generating functions $h_\eta(z)$ and $h_\eta^\theta(z)$
of the optimum and threshold detectors for $WT \gg 1$ by (11-96) and (11-99),
respectively. It will in general still be necessary to integrate (11-94) and (11-95) numerically
along a suitable path, as described in Sec. 5.2, in order to calculate even the
approximate false-alarm and detection probabilities by this method.

11.2.7 Detectability of Signals with Lorentz or
Rectangular Spectral Densities

Two types of signal spectral densities that manifest opposite extremes of frequency
dependence are the Lorentz,

$$\Phi_s(\omega) = \frac{2\mu P_s}{\omega^2 + \mu^2},$$

as in (11-47), and the rectangular,

$$\Phi_s(\omega) = \begin{cases} P_s/W, & -\pi W < \omega < \pi W, \\ 0, & \text{otherwise}. \end{cases} \tag{11-108}$$

Here $P_s = E/T$ is the average power in the signal.

The Lorentz spectral density has very long tails, and the signal has a substantial
fraction of its power at large deviations from the carrier frequency; the rectangular
cuts off sharply at each edge of its spectral band. We shall compare the detectabilities
of stochastic signals with these spectral densities when either the optimum or
the threshold detector is utilized. It will be assumed that both have equal effective
bandwidths $W$ as defined by (11-93). For the Lorentz spectral density, the
autocovariance function is given in (11-46), and from (11-93) we find that its effective
bandwidth is $W = \mu$. For the rectangular spectral density, substitution of (11-108)
into (11-93) shows that $W$ is its effective bandwidth in that same sense.

Figures 11-2 and 11-3 refer to signals having a Lorentz spectral density. In
Fig. 11-2 we have plotted the probability $Q_d(1) = Q_{ds}$ that the optimum detector
correctly decides that a signal of the standard strength for which it was designed is
present versus the square root $D$ of the input energy-to-noise ratio $E/N$ for that
standard signal. The curves are indexed with the value of $m = \mu T$, and each point
on a curve represents a different detector. In Fig. 11-3 we exhibit as a function of
$m = \mu T$ the average energy-to-noise ratio $E/N$ in decibels required to attain three
values of the probability $Q_{ds}$ of detecting the standard signal. The solid curves refer
to the optimum detector, the dashed curves to the threshold detector of Sec. 11.1.6.

The computations needed to produce these two figures were rather lengthy,
and the details have been relegated to Appendix H. The residue series in (11-80) and
(11-85) were used for values of $m = \mu T$ less than about 7. Appendix H provides a
closed expression for the residue factor $R_k$ in (11-79), circumventing the computation
of infinite products. For $7 < m < 12$ saddlepoint integration as in Sec. 11.2.4
was employed. For $m > 12$, as shown in Appendix H, one can apply the Toeplitz
approximation of Sec. 11.2.5 and obtain approximate, but accurate, false-alarm and
detection probabilities in closed form.

For $m \ll 1$ and for $m \gg 1$ the performance of the threshold detector approaches
that of the optimum detector. The closer the probability of detection
is to 1, the more slowly that approach takes place and the greater the disparity
between the two detectors.

When the stochastic signal has the rectangular spectral density in (11-108), its
complex autocovariance function is

$$\phi_s(\tau) = P_s\, \frac{\sin \pi W \tau}{\pi W \tau}.$$

We need the eigenvalues $\lambda_k$ of the integral equation (11-74), which can now be written as



Figure 11-2. Probability Q t i s of detection of a Gaussian stochastic signal with a 
Lorentz spectral density (11-47) versus the square root D of the average energy-to- 
noise ratio E/N. Curves are indexed with the value of uT; Qq = IG~ ( \ 

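$$\lambda_k f_k(t) = \frac{P_s}{NW} \int_{-T/2}^{T/2} \frac{\sin \pi W (t - u)}{\pi (t - u)}\, f_k(u)\, du,$$

and by changing the integration variable we can write this as

$$\lambda_k F_k(x) = \frac{P_s}{NW} \int_{-1}^{1} \frac{\sin[\tfrac{1}{2}\pi W T (x - y)]}{\pi (x - y)}\, F_k(y)\, dy, \qquad -1 < x < 1. \tag{11-109}$$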



Slepian and Sonnenblick [Sle65b] tabulated the eigenvalues $\lambda_k^{(s)}$ of the integral
equation

$$\lambda_k^{(s)} F_k(x) = \int_{-1}^{1} \frac{\sin c(x - y)}{\pi (x - y)}\, F_k(y)\, dy, \qquad -1 < x < 1,$$

and comparing this with (11-109) we see that the eigenvalues we have been using are
related to theirs by

$$\lambda_k = \frac{P_s}{NW}\, \lambda_k^{(s)} = \frac{E}{NWT}\, \lambda_k^{(s)} \tag{11-110}$$
Figure 11-3. Lorentz spectral density: Energy-to-noise ratio (dB) required to attain
a probability $Q_{ds}$ of detection for the standard signal as a function of the
time-bandwidth product $m = \mu T$; $Q_0 = 10^{-5}$. Solid curves: optimum detector;
dashed curves: threshold detector. Curves are indexed with the values of $Q_{ds}$.

for $c = \tfrac{1}{2}\pi W T$. For a moderately large time-bandwidth product $WT \gg 1$, there are
roughly $WT$ nearly equal eigenvalues

$$\lambda_k \approx \frac{E}{NWT}, \qquad 0 \le k < WT,$$

and the rest are negligible [Sle65a].

The curves of the detection probability $Q_{ds}$ versus the square root $D$ of the
energy-to-noise ratio $E/N$ for the rectangular spectral density look much the same as
those for the Lorentz spectral density plotted in Fig. 11-2. The differences between
the performances of the optimum and threshold detectors are most clearly evident
by comparing Fig. 11-3 with 11-4, which plots the energy-to-noise ratio $E/N$ needed
to attain detection probabilities $Q_{ds}$ = 0.9, 0.99, and 0.999 with both detectors when
the spectral density is rectangular. The abscissa is the time-bandwidth product $WT$,
which as we have seen corresponds to the parameter $\mu T$ for the Lorentz spectral
density. The disparity between the optimum and the threshold detectors is much
smaller for the rectangular than for the Lorentz spectral density.

In calculating these detection probabilities and the false-alarm probabilities
needed for setting the decision level $U_0$, we used the residue series (11-80) and (11-85)
for values of $WT$ less than about 7, computing the coefficients $R_k$ by (11-79) carried
to a finite number of factors. For larger values of $WT$, however, the largest
eigenvalues lie too close together to permit accurate computation of the $R_k$'s from
the calculated eigenvalues. It was then necessary to determine $Q_0$ and $Q_{ds}$ by
numerical integration of (11-94) and (11-95). The Fredholm determinant $D(z)$ figuring
in the moment-generating functions could be computed accurately enough
along the parabolic path of integration by evaluating the product in (11-76) with the
eigenvalues $\lambda_k$ determined from the tabulated eigenvalues $\lambda_k^{(s)}$ as in (11-110).

Figure 11-4. Rectangular spectral density: Energy-to-noise ratio (dB) required
to attain probability $Q_{ds}$ of detection for the standard signal as a function of the
time-bandwidth product $m = WT$; $Q_0 = 10^{-6}$. Solid curves: optimum detector;
dashed curves: threshold detector. Curves are indexed with the values of $Q_{ds}$.

When $WT \gg 1$, the near equality of the first $WT$ eigenvalues $\lambda_k$ and the
insignificance of the rest permit us to approximate the Fredholm determinant by

$$D(z) \approx (1 + \lambda z)^m, \qquad \lambda = \frac{E}{NWT}, \qquad m \approx WT.$$

As one can see by substituting the rectangular spectral density from (11-108) into
(11-102), this corresponds to the Toeplitz approximation.

For integral values of $m = WT$ the false-alarm and detection probabilities are
given approximately by scaled cumulative chi-squared distributions. Indeed, our
detection statistic is approximately

$$U \approx \frac{\lambda}{2N(1 + \lambda)} \sum_{k=1}^{m} |V_k|^2$$

by (11-87), for the matrix $N^{-1}\boldsymbol{\phi}_s$ can now be approximated as an $m \times m$ diagonal matrix
with diagonal elements $\lambda$. Furthermore, when $m \approx WT \gg 1$ is so large that we can
neglect all but the first $m$ eigenvalues and take these as equal, the threshold statistic
(11-88) becomes

$$U_\theta \approx \frac{\lambda}{2N} \sum_{k=1}^{m} |V_k|^2,$$

and the optimum and threshold detection statistics $U$ and $U_\theta$ differ only by an
insignificant constant factor. We expect, therefore, that for $WT \gg 1$ the optimum
and the threshold detectors will have nearly the same performance when the spectral
density of the signals is rectangular.

The sum

$$\sum_{k=1}^{m} |V_k|^2$$

has a scaled chi-squared distribution under both hypotheses $H_0$ and $H_1$, and the
false-alarm and detection probabilities for both detectors are approximately

$$Q_0 \approx q(x_0), \qquad Q_{ds} \approx q(x_1), \qquad x_1 = \frac{x_0}{1 + \lambda},$$

with

$$q(x) = e^{-x} \sum_{k=0}^{m-1} \frac{x^k}{k!}, \tag{11-111}$$

as in (4-31). Here $x_0 = U_0 (1 + \lambda)/\lambda$. For nonintegral values of $m \approx WT$ it is
necessary to compute $Q_0$ and $Q_{ds}$ by numerical contour integration of (11-94) and
(11-95), substituting the approximate forms

$$h_0(z) \approx \left[ 1 + \frac{\lambda z}{1 + \lambda} \right]^{-m}, \qquad h_1(z) \approx (1 + \lambda z)^{-m}. \tag{11-112}$$

Indeed, when $m = WT$ is very large, even though an integer, numerical contour
integration is more accurate than summing the series in (11-111).
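For integer $m$ the equal-eigenvalue approximation is a few lines of arithmetic. The sketch below evaluates the series (11-111) and, as an added convenience that is not in the text, finds the decision level for a prescribed false-alarm probability by bisection. Function names are hypothetical.

```python
import math

def q_tail(x, m):
    """q(x) = exp(-x) * sum_{k=0}^{m-1} x^k / k!, the series (11-111)."""
    term, total = 1.0, 1.0
    for k in range(1, m):
        term *= x / k
        total += term
    return math.exp(-x) * total

def equal_eigenvalue_probabilities(U0, lam, m):
    """(Q_0, Q_ds) for m equal eigenvalues lam and decision level U0."""
    x0 = U0 * (1.0 + lam) / lam
    x1 = x0 / (1.0 + lam)
    return q_tail(x0, m), q_tail(x1, m)

def decision_level(q0_target, lam, m):
    """Decision level U0 with Q_0 = q0_target; q_tail decreases in x, so bisect on x0."""
    lo, hi = 0.0, 1.0
    while q_tail(hi, m) > q0_target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if q_tail(mid, m) > q0_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) * lam / (1.0 + lam)

lam, m = 2.0, 10                       # e.g. E/N = 20 spread over WT = 10 eigenvalues
U0 = decision_level(1e-6, lam, m)
print(U0, equal_eigenvalue_probabilities(U0, lam, m))
```

For $m = 1$ the pair returned by `equal_eigenvalue_probabilities` satisfies $Q_{ds} = Q_0^{1/(1+\lambda)}$, the single-eigenvalue result of Sec. 11.2.2. For very large $m$ the terms of the series span many orders of magnitude, which is one reason contour integration can be preferable.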

Mismatch between the design signal strength and the strength of the signal actually
present only slightly affects the performance of the detector based on the statistic
$U = \tfrac{1}{2} \mathbf{V}^+ \mathbf{H} \mathbf{V}$ with $\mathbf{H}$ given by (11-72), provided that the design signal strength is
equal to or greater than that of the signal actually present. The sensitivity of the
performance to the choice of the standard signal strength is greater for the Lorentz
than for the rectangular spectral density. This behavior is brought out in Figs. 11-5
and 11-6. In each we plot the input energy-to-noise ratio $S = \eta E/N$ required to
attain a certain probability $Q_d$ of detection as a function of the energy-to-noise ratio
$S_0 = E/N$ for which the optimum detector is designed via (11-87). The minimum of
each curve corresponds to $\eta = 1$. The value $S_0 = 0$ represents the threshold detector.

Crude approximations to the false-alarm and detection probabilities for signals
with spectral densities other than the rectangular can be computed by using the
approximate moment-generating functions in (11-112). The values of $\lambda$ and $m = WT$
can be calculated from (11-91) and (11-92). We might call these "equal-eigenvalue"
approximations. When $m$ is an integer (and it usually suffices to take it as an
integer), it is simply a matter of summing the series in (11-111). The more the
form of the spectral density departs from the rectangular, the less accurate this type
of approximation is.

Figure 11-5. Input energy-to-noise ratio $S = \eta E/N$ to yield detection probability
$Q_d$ for $Q_0 = 10^{-4}$ versus the design energy-to-noise ratio $S_0 = E/N$; Lorentz
spectral density, $\mu T = 14/\pi = 4.4563$. Curves are indexed with the value of
$Q_d$. [Reprinted from C. W. Helstrom, "Evaluating the detectability of Gaussian
stochastic signals by steepest-descent integration," IEEE Transactions on Aerospace
& Electronic Systems, AES-19 (May 1983), 428-37, © 1983 IEEE.]



11.3 CAUSAL ESTIMATOR-CORRELATOR REPRESENTATION 

11.3.1 Causal Estimator of the Stochastic Signal 

In this section we shall derive a simpler way of calculating the Fredholm determinant
$D(z)$, defined as in (11-76). As a useful by-product of this analysis we shall formulate
the optimum detector of a Gaussian stochastic signal in white noise in such a way
as to bring out its resemblance to that for a deterministic narrowband signal. In
Sec. 11.4 this detector will be described in terms of a representation of the signal as
the output of a linear system driven by white noise, and from a state-space model of
that system equations permitting the numerical computation of $D(z)$ and a fortiori
of the false-alarm and detection probabilities can be deduced.
Figure 11-6. Input energy-to-noise ratio $S = \eta E/N$ to yield detection probability
$Q_d$ for $Q_0 = 10^{-4}$ versus the design energy-to-noise ratio $S_0 = E/N$; rectangular
spectral density, $WT = 14/\pi = 4.4563$. Curves are indexed with the value of $Q_d$.
[Reprinted from C. W. Helstrom, "Evaluating the detectability of Gaussian stochastic
signals by steepest-descent integration," IEEE Transactions on Aerospace & Electronic
Let us consider estimating the values of the complex envelope $S(t)$ of the
signal during the subinterval $(0, \tau)$ within the observation interval $(0, T)$. The signal
is known to be present in the midst of white noise of unilateral spectral density
$N$. For the sake of future calculations, however, we shall assume that the complex
autocovariance function of the signal is not $\phi_s(t, u)$, but $z\phi_s(t, u)$, as though the
power of the signal were $z$ times as large. We designate the complex envelope of this
signal by $S_z(t)$. We seek the maximum-a-posteriori-probability (MAP) estimator of
the complex envelope $S_z(\tau)$ at time $\tau$ on the basis of the input $v(t) = \mathrm{Re}\, V(t) \exp i\Omega t$
during the preceding interval $(0, \tau)$.

We can again formulate our problem and carry out its analysis in terms of
vectors and matrices as in Sec. 11.1, but the functions $f_k(t)$ defining their elements
are now orthonormal over the interval $(0, \tau)$. From (11-58) we find that the vector
representing the estimator of the complex envelope $S_z(t)$ is

$$\hat{\mathbf{S}}_z = \mathbf{M}\mathbf{V}, \qquad \mathbf{M} = z \boldsymbol{\phi}_s \boldsymbol{\phi}_1^{-1}, \qquad \boldsymbol{\phi}_1 = N\mathbf{I} + z \boldsymbol{\phi}_s, \tag{11-113}$$

for the covariance matrix of samples of the noise is now $\boldsymbol{\phi}_0 = N\mathbf{I}$. Because the noise
is white and $\boldsymbol{\phi}_0 = N\mathbf{I}$, $\boldsymbol{\phi}_1$ and $\boldsymbol{\phi}_s$ commute, $\mathbf{M} = z \boldsymbol{\phi}_1^{-1} \boldsymbol{\phi}_s$, and the matrix $\mathbf{M}$ is the
solution of

$$\boldsymbol{\phi}_1 \mathbf{M} = (N\mathbf{I} + z \boldsymbol{\phi}_s) \mathbf{M} = z \boldsymbol{\phi}_s. \tag{11-114}$$

This matrix equation is equivalent to the integral equation

$$N m(t, u; z; \tau) + z \int_0^\tau \phi_s(t, r)\, m(r, u; z; \tau)\, dr = z \phi_s(t, u), \qquad 0 < (t, u) < \tau. \tag{11-115}$$

The parameter $\tau$ in $m(t, u; z; \tau)$ reminds us that this function is defined over $(0, \tau)$
and is the solution of an integral equation (11-115) over that interval.

The estimator of the complex envelope $S_z(t)$ of the signal at time $t$ in $(0, \tau)$ is
then

$$\hat{S}_z(t) = \int_0^\tau m(t, u; z; \tau)\, V(u)\, du, \qquad 0 < t < \tau. \tag{11-116}$$

In particular, the MAP estimator of $S_z(\tau)$ is

$$\hat{S}_z(\tau) = \int_0^\tau m(\tau, u; z; \tau)\, V(u)\, du. \tag{11-117}$$

This is a causal estimator of the signal at time $\tau$ because it utilizes only the complex
envelope $V(t)$ of the past input during $(0, \tau)$.

11.3.2 Recalculation of the Fredholm Determinant

The Fredholm determinant $D(z; \tau)$ can be defined for an arbitrary interval $(0, \tau)$ as
in (11-76), provided that the eigenvalues $\lambda_k = \lambda_k(\tau)$ are those of the integral equation
(11-74) with the upper limit $T$ replaced by $\tau$. Denoting the logarithm of $D(z; \tau)$ by
$\mathcal{E}(z; \tau)$, we find from (11-100)

$$\mathcal{E}(z; \tau) = \ln D(z; \tau) = N \int_0^z du \int_0^\tau h(t, t; u; \tau)\, dt, \tag{11-118}$$

where by (11-53) the function $h(r, t; u; \tau)$ is the solution of the integral equation

$$N h(r, t; u; \tau) + u \int_0^\tau \phi_s(r, s)\, h(s, t; u; \tau)\, ds = N^{-1} \phi_s(r, t), \qquad 0 < (r, t) < \tau. \tag{11-119}$$

We shall set up a differential equation for $\mathcal{E}(z; \tau)$ as a function of $\tau$, and for
that we first need an integral equation for the partial derivative of $h(r, t; u; \tau)$ with
respect to $\tau$. Following [Sie57] and designating that partial derivative by a subscript $\tau$,

$$\frac{\partial}{\partial \tau} h(r, t; u; \tau) = h_\tau(r, t; u; \tau),$$

we differentiate (11-119) to obtain the integral equation

$$N h_\tau(r, t; u; \tau) + u \int_0^\tau \phi_s(r, s)\, h_\tau(s, t; u; \tau)\, ds = -u\, \phi_s(r, \tau)\, h(\tau, t; u; \tau). \tag{11-120}$$

The solution of this integral equation is

$$h_\tau(r, t; u; \tau) = -N u\, h(r, \tau; u; \tau)\, h(\tau, t; u; \tau), \tag{11-121}$$

as we can see by substituting it into (11-120) and using (11-119) with $t$ replaced by $\tau$:

$$-N^2 u\, h(r, \tau; u; \tau)\, h(\tau, t; u; \tau) - N u^2 \int_0^\tau \phi_s(r, s)\, h(s, \tau; u; \tau)\, h(\tau, t; u; \tau)\, ds = -u\, \phi_s(r, \tau)\, h(\tau, t; u; \tau).$$
If we now differentiate (11-118) with respect to $\tau$, we find

$$\frac{\partial E(z; \tau)}{\partial \tau} = N \int_0^z h(\tau, \tau; u; \tau)\, du + N \int_0^z du \int_0^{\tau} h_\tau(t, t; u; \tau)\, dt$$
$$= N \int_0^z \left[ h(\tau, \tau; u; \tau) - N u \int_0^{\tau} h(t, \tau; u; \tau)\, h(\tau, t; u; \tau)\, dt \right] du \tag{11-122}$$
$$= N \int_0^z g(\tau, \tau; u)\, du$$

in terms of the function

$$g(r, t; u) = h(r, t; u; \tau) - N u \int_0^{\tau} h(r, v; u; \tau)\, h(v, t; u; \tau)\, dv. \tag{11-123}$$

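We now seek an expression for the last integral in (11-122). As we are through
with differentiating with respect to $\tau$, we can return to the matrix domain, where the
manipulations are less cumbersome.

Denoting by $\mathbf{H}(u)$ the matrix corresponding to the solution of (11-119), we can
express that equation in matrix form as

$$(N\mathbf{I} + u\boldsymbol{\phi}_s)\mathbf{H}(u) = N^{-1}\boldsymbol{\phi}_s,$$

so that

$$\mathbf{H}(u) = N^{-1}(N\mathbf{I} + u\boldsymbol{\phi}_s)^{-1}\boldsymbol{\phi}_s, \tag{11-124}$$

and the matrix $\mathbf{G}(u)$ corresponding to the function defined in (11-123) is

$$\mathbf{G}(u) = \mathbf{H}(u) - N u \mathbf{H}(u)\mathbf{H}(u) = \mathbf{H}(u)[\mathbf{I} - N u \mathbf{H}(u)]$$
$$= \mathbf{H}(u)[\mathbf{I} - u(N\mathbf{I} + u\boldsymbol{\phi}_s)^{-1}\boldsymbol{\phi}_s]$$
$$= N\mathbf{H}(u)(N\mathbf{I} + u\boldsymbol{\phi}_s)^{-1} = (N\mathbf{I} + u\boldsymbol{\phi}_s)^{-2}\boldsymbol{\phi}_s.$$

Multiplying by du and integrating over (0, z), we obtain

$$\int_0^z \mathbf{G}(u)\, du = N^{-1}\mathbf{I} - (N\mathbf{I} + z\boldsymbol{\phi}_s)^{-1} = z N^{-1}(N\mathbf{I} + z\boldsymbol{\phi}_s)^{-1}\boldsymbol{\phi}_s = z\mathbf{H}(z) \tag{11-125}$$

by (11-124). Returning to the time domain, we express (11-125) as

$$\int_0^z g(r, t; u)\, du = z\, h(r, t; z; \tau),$$

and (11-122) becomes

$$\frac{\partial E(z; \tau)}{\partial \tau} = N z\, h(\tau, \tau; z; \tau).$$

Integrating over (0, T), we find

$$E(z; T) = \ln D(z; T) = N z \int_0^T h(\tau, \tau; z; \tau)\, d\tau. \tag{11-126}$$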

This is Siegert's formula for the logarithm of the Fredholm determinant D(z) in
terms of the solution of (11-119), with u replaced by z. In contrast to (11-100) it
does not require integration over the parameter u.
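When the signal to be detected has the Lorentz spectral density (11-47), we
find from (11-56), replacing r, t, and T all by $\tau$, and replacing u by z,

$$h(\tau, \tau; z; \tau) = \frac{2\mu P_s}{N^2}\, \frac{(\beta + \mu) e^{\beta\tau} + (\beta - \mu) e^{-\beta\tau}}{(\beta + \mu)^2 e^{\beta\tau} - (\beta - \mu)^2 e^{-\beta\tau}}
= \frac{1}{Nz} \frac{d}{d\tau} \ln\left[(\beta + \mu)^2 e^{(\beta - \mu)\tau} - (\beta - \mu)^2 e^{-(\beta + \mu)\tau}\right], \tag{11-127}$$

where $\beta^2 = \mu^2 + 2\mu P_s z/N$. Substituting this into (11-126), we obtain

$$\ln D(z; T) = \ln\left[(\beta + \mu)^2 e^{(\beta - \mu)T} - (\beta - \mu)^2 e^{-(\beta + \mu)T}\right] - \ln(4\beta\mu),$$

whereupon the Fredholm determinant becomes

$$D(z; T) = \frac{(\beta + \mu)^2 e^{(\beta - \mu)T} - (\beta - \mu)^2 e^{-(\beta + \mu)T}}{4\beta\mu}. \tag{11-128}$$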
If we compare (11-115) and (11-119), we see that

$$m(t, u; z; \tau) = N z\, h(t, u; z; \tau), \tag{11-129}$$

and Siegert's formula (11-126) can be written as

$$E(z; T) = \ln D(z) = \int_0^T m(\tau, \tau; z; \tau)\, d\tau \tag{11-130}$$

in terms of the solution of the integral equation (11-115). This solution is related to
the kernel of the causal estimator of a signal having complex autocovariance function
$z\phi_s(t, u)$.
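
Siegert's formula (11-126) can be checked numerically for the Lorentz case. The sketch below is only an illustration: it assumes the closed forms (11-127) and (11-128) with $\beta^2 = \mu^2 + 2\mu P_s z/N$, and the function names (`h_diag`, `ln_D_siegert`, `ln_D_closed`) and the choice of Simpson's rule are ours, not the text's.

```python
import cmath

def beta(z, mu, Ps, N):
    """Root with nonnegative real part of beta^2 = mu^2 + 2*mu*Ps*z/N."""
    return cmath.sqrt(mu * mu + 2.0 * mu * Ps * z / N)

def h_diag(tau, z, mu, Ps, N):
    """h(tau, tau; z; tau) from (11-127); numerator and denominator are
    both multiplied by exp(-beta*tau) to avoid overflow for large tau."""
    b = beta(z, mu, Ps, N)
    e = cmath.exp(-2.0 * b * tau)
    num = (b + mu) + (b - mu) * e
    den = (b + mu) ** 2 - (b - mu) ** 2 * e
    return (2.0 * mu * Ps / N ** 2) * num / den

def ln_D_siegert(z, T, mu, Ps, N, steps=2000):
    """Simpson's-rule value of N z * integral_0^T h(tau,tau;z;tau) dtau, (11-126)."""
    if steps % 2:
        steps += 1
    dt = T / steps
    total = h_diag(0.0, z, mu, Ps, N) + h_diag(T, z, mu, Ps, N)
    for k in range(1, steps):
        weight = 4.0 if k % 2 else 2.0
        total += weight * h_diag(k * dt, z, mu, Ps, N)
    return N * z * total * dt / 3.0

def ln_D_closed(z, T, mu, Ps, N):
    """Logarithm of the closed-form determinant (11-128), principal branch."""
    b = beta(z, mu, Ps, N)
    f = (b + mu) ** 2 * cmath.exp((b - mu) * T) \
        - (b - mu) ** 2 * cmath.exp(-(b + mu) * T)
    return cmath.log(f / (4.0 * b * mu))

if __name__ == "__main__":
    for z in (1.0, 0.5 + 0.5j):
        print(z, ln_D_siegert(z, 2.0, 1.0, 2.0, 1.0), ln_D_closed(z, 2.0, 1.0, 2.0, 1.0))
```

For complex z the integral follows a continuous branch of the logarithm, whereas `cmath.log` returns the principal value, so for large T or large |z| the two may differ by a multiple of $2\pi i$; comparing the exponentials avoids the ambiguity.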



11.3.3 The Mean-square Estimation Error

The total mean-square error in the estimate $\hat{S}_z(\tau)$ of the complex envelope $S_z(\tau)$ of
the signal $s_z(t) = \mathrm{Re}\, S_z(t) \exp i\Omega t$ can be defined as

$$\varepsilon_T = \tfrac{1}{2} E \int_0^T |\hat{S}_z(\tau) - S_z(\tau)|^2\, d\tau.$$

The expectation E is taken under hypothesis $H_1$ that the signal $s_z(t)$ is present. We
shall demonstrate that this total error equals N ln D(z), where D(z) is the Fredholm
determinant.

By the principle of orthogonality, which we introduced in Sec. 6.1.4, the error
$\hat{S}_z(\tau) - S_z(\tau)$ is orthogonal to the data V(t), $0 \le t \le \tau$, on which the estimator is
based through (11-117). (When dealing with circular complex random variables, the
transpose used in Sec. 6.1.4 is replaced by the Hermitian conjugate.) It is therefore
orthogonal to any linear combination of those data, and in particular to $\hat{S}_z(\tau)$ itself:

$$\tfrac{1}{2} E\{[\hat{S}_z(\tau) - S_z(\tau)]\, \hat{S}_z^*(\tau)\} = 0.$$

The instantaneous mean-square error is thus, with (11-117),

$$\varepsilon(\tau) = \tfrac{1}{2} E\{[S_z(\tau) - \hat{S}_z(\tau)][S_z^*(\tau) - \hat{S}_z^*(\tau)]\}$$
$$= \tfrac{1}{2} E\{[S_z(\tau) - \hat{S}_z(\tau)]\, S_z^*(\tau)\}$$
$$= z\phi_s(\tau, \tau) - \tfrac{1}{2} E \int_0^{\tau} m(\tau, u; z; \tau)\, V(u)\, S_z^*(\tau)\, du.$$

Because the signal and the noise are statistically independent,

$$\tfrac{1}{2} E[V(u) S_z^*(\tau)] = \tfrac{1}{2} E[S_z(u) S_z^*(\tau)] = z\phi_s(u, \tau),$$

and the instantaneous mean-square error is

$$\varepsilon(\tau) = z\phi_s(\tau, \tau) - z \int_0^{\tau} m(\tau, u; z; \tau)\, \phi_s(u, \tau)\, du. \tag{11-131}$$

To the matrix equation

$$\mathbf{M}(N\mathbf{I} + z\boldsymbol{\phi}_s) = N\mathbf{M} + z\mathbf{M}\boldsymbol{\phi}_s = z\boldsymbol{\phi}_s$$

from (11-113) corresponds the integral equation

$$N m(r, t; z; \tau) + z \int_0^{\tau} m(r, u; z; \tau)\, \phi_s(u, t)\, du = z\phi_s(r, t), \qquad 0 \le (r, t) \le \tau; \tag{11-132}$$

compare (11-115). Putting $r = t = \tau$, we obtain

$$N m(\tau, \tau; z; \tau) + z \int_0^{\tau} m(\tau, u; z; \tau)\, \phi_s(u, \tau)\, du = z\phi_s(\tau, \tau),$$

whence (11-131) becomes

$$\varepsilon(\tau) = N m(\tau, \tau; z; \tau).$$

The total mean-square error is thus

$$\varepsilon_T = \tfrac{1}{2} E \int_0^T |\hat{S}_z(\tau) - S_z(\tau)|^2\, d\tau = N \int_0^T m(\tau, \tau; z; \tau)\, d\tau = N \ln D(z) \tag{11-133}$$

by (11-130).

11.3.4 The Fredholm Determinant for a Rational 
Spectral Density 

We shall now use (11-130) to calculate the Fredholm determinant D(z) for detection 
of a stationary Gaussian stochastic signal having a rational spectral density $\Phi_s(\omega)$ of
the form in (11-104). The noise is still white with unilateral spectral density N. The 
result of our labors will be an expression for D(z) as a quotient of two determinants. 
When computing false-alarm and detection probabilities by saddlepoint integration 
of (11-94) and (11-95), these can be evaluated by standard computer routines for 
evaluating determinants and solving algebraic equations. 
The integrand of (11-130) can be obtained from the solution of (11-115) with
$u = \tau$. As the signal is stationary, that integral equation becomes

$$N M_z(t) + z \int_0^{\tau} \phi_s(t - r)\, M_z(r)\, dr = z\phi_s(t - \tau), \qquad 0 \le t \le \tau, \tag{11-134}$$

with $M_z(t) = m(t, \tau; z; \tau)$. Then by (11-130)

$$\ln D(z) = \int_0^T M_z(\tau)\, d\tau. \tag{11-135}$$

As in (2-98) and (2-99), $M_z(t)$ must have the form

$$M_z(t) = \sum_{j=1}^{2n} c_j(\tau) \exp \beta_j t, \tag{11-136}$$

in which the $\beta_j$ are the 2n roots of the characteristic equation arising from (11-134),

$$N + z\Phi_s(-ip) = 0, \qquad p = \beta_1, \beta_2, \ldots, \beta_{2n}; \tag{11-137}$$

here $\Phi_s(\omega)$ is the rational spectral density of the signal. Like (11-105), this becomes
an algebraic equation of degree 2n when we substitute the spectral density from
(11-104) and clear fractions. Again, when z does not lie on the negative real axis,
there are n roots $\beta_1, \ldots, \beta_n$ in the right half of the p-plane and n roots $\beta_{n+1}, \ldots, \beta_{2n}$
in the left half-plane. When z goes to 0, these roots go into the 2n poles $\mu_1, \ldots, \mu_{2n}$
of the spectral density $\Phi_s(-ip)$; $\mu_{k+n} = -\mu_k^*$, $1 \le k \le n$.

When (11-103) and (11-136) are substituted into (11-134) and the integrations 
are carried out, we obtain — much as in Appendix A— the equation 

2,1 f jl r ( »(fo-i j +> T i , ~n 

A=l 

The first term vanishes because the 0/s are the roots of (11-137). Equating terms 
proportional to exp j*,* / on each side of what is left, we obtain 

Y c.-| = e~^\ \<k <n, (11-138) 

and equating terms proportional to exp(~jx|r), we find 

V c 5 =o, n + \ <k <2n. (11-139) 

p y _ ^ 

Denote by G' the matrix whose elements are 



^4" ~ 



ft ^ 1<;<2», (1 1-140) 

! , n + 1 < < 2n. 

0/ - M* 






Then from (11-138) and (11-139) 
and by (11-136) 

A&W = X X e pjT (G'~% e-» T . (1 1-141) 

Now consider, in view of (1 1-20), 

± >n de, G'(t) = ^Tr in G'( T ) = Tr = | £ (G")^. 

/=! /r=l 

By (11-140) 

f 1 < < 

1 0, n + 1 < k < 2n, 

whereupon by (11-141) 



M(t) = ~ In det G'(t). 
Putting this into (11-135), we find that the Fredholm determinant is 

det GOT < U " 142 > 



det G(0) ' 

(V— I J 

where the matrix G(r) is defined by 



G kJ {T) = 



1 < k < «, 



\<j<2n, (11-143) 

, n + \ <k <2n, 

Pj ~ V* 

as in (A- 15), and G(0) is obtained by putting T = 0. 
The eigenvalues X* of the integral equation 

**/*</) = ji^M ~ o < * < r, 

are $\lambda_k = -z_k^{-1}$, where by (11-76) $D(z_k) = 0$, and with (11-142) this leads to (A-22).

When T becomes much larger than the reciprocal bandwidth $W^{-1}$ of the signal,
those factors $\exp \beta_j T$ in the upper half of the determinant det G(T) with $1 \le j \le n$
dominate because $\mathrm{Re}\, \beta_j > 0$ for the first n $\beta_j$'s. After factoring those exponential
factors out of the first n columns of G(T), we see that D(z) is proportional to $D_M(z)$
as given in (11-107). The ratio $D(z)/D_M(z) = 1$ at z = 0, and when $WT \gg 1$ it is
nearly equal to 1 along that part of the path of integration contributing significantly
to (11-94) and (11-95).
By the same argument as in Appendix A, we can see that if the spectral density
of the signal has a pole $\mu$ of order p, both the upper and lower halves of both
determinants G(T) and G(0) will contain p - 1 rows of which each is the derivative
with respect to $\mu$ of the row above it. In this way the existence of multiple poles of
the spectral density $\Phi_s(\omega)$ of the signal can be accommodated.

In order to understand the behavior of the roots of (11-137) as the complex
parameter z varies, it is instructive to consider the root locus of the equation

$$N + z\Phi_s(\omega) = 0 \tag{11-144}$$

in the complex $\omega$-plane. The root locus of (11-137) in the complex p-plane is obtained
by rotation counterclockwise through 90°. The root locus of (11-144) possesses 2n
branches, and if we think of the complex parameter z as starting out from the origin
z = 0 of the z-plane, we perceive that each of those 2n branches originates at one of
the 2n poles of the spectral density $\Phi_s(\omega)$. These poles occur in complex-conjugate
pairs, and $i\mu_k^* = -i\mu_{k+n}$. The 2m zeros of $\Phi_s(\omega)$ either occur in complex-conjugate
pairs or, if real, possess even multiplicity. [If a real zero of $\Phi_s(\omega)$ had odd
multiplicity, $\Phi_s(\omega)$ would become negative for a real value of $\omega$ in its neighborhood.]
Of the 2n branches of the root locus, 2(n - m) go off to infinity and 2m enter the
2m zeros of $\Phi_s(\omega)$ when $|z| \to \infty$. Of these 2m zeros, m lie in the lower half of the
$\omega$-plane and m in the upper half-plane, unless there are some zeros on the Re $\omega$-axis.
When a branch enters one of these real-axis zeros from the lower half-plane, a mirror
image of that branch must enter it from the upper half-plane, for the real-axis zeros
have even multiplicity. Thus as $|z| \to \infty$, m of the 2n branches of the root locus
must finish in the lower half of the $\omega$-plane and m must finish in the upper half.

The 2(n - m) branches of the root locus heading toward infinity approach
asymptotes at angles

$$\arg \omega = \frac{(2k + 1)\pi + \arg z}{2(n - m)}, \qquad 0 \le k < 2(n - m).$$

For $0 \le \arg z < \pi$, half the asymptotes, which are spaced uniformly in angle, will lie
in the lower half-plane and half will lie in the upper half-plane. When the $\omega$-plane is
rotated counterclockwise to become the p-plane, there will be n branches of the root
locus in the right half of that plane and n in the left half. The roots $\beta_k$ traversing the
former set of branches are assigned indices from 1 to n; those traversing the latter
set receive indices from n + 1 to 2n.

When z is real and sufficiently negative, at least two of the roots $\beta_k$ will be
purely imaginary. One of each pair of these purely imaginary roots is assigned
an index k between 1 and n, the other an index greater by n; it does not matter
which is which. These purely imaginary roots of (11-137) arise when calculating the
zeros $z_k = -1/\lambda_k$ of the Fredholm determinant in (11-76) and also, possibly, when
locating a saddlepoint in Re z < 0.
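
The behavior of the roots can be illustrated for the simplest case n = 1. The sketch below assumes the Lorentz spectral density in the form $\Phi_s(\omega) = 2\mu P_s/(\omega^2 + \mu^2)$, for which (11-137) reduces to $p^2 = \mu^2 + 2\mu P_s z/N$; the function names are hypothetical.

```python
import cmath

def lorentz_roots(z, mu, Ps, N):
    """Roots p = +beta, -beta of N + z*Phi_s(-ip) = 0 for the assumed
    Lorentz density Phi_s(w) = 2*mu*Ps / (w**2 + mu**2)."""
    b = cmath.sqrt(mu * mu + 2.0 * mu * Ps * z / N)  # principal root, Re >= 0
    return b, -b

def characteristic(p, z, mu, Ps, N):
    """Left side of (11-137) for this density; undefined at the poles p = +-mu."""
    return N + z * 2.0 * mu * Ps / (mu * mu - p * p)

if __name__ == "__main__":
    mu, Ps, N = 1.0, 1.0, 1.0
    # z off the negative real axis: one root in each half of the p-plane.
    print(lorentz_roots(0.3 + 0.4j, mu, Ps, N))
    # z = 0: the roots sit at the poles +mu and -mu.
    print(lorentz_roots(0.0, mu, Ps, N))
    # z real and below -N*mu/(2*Ps): both roots are purely imaginary.
    print(lorentz_roots(-2.0, mu, Ps, N))
```

With these parameter values the roots become purely imaginary once z < -0.5, which is the n = 1 instance of the statement above about sufficiently negative real z.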

11.3.5 The Likelihood Functional 

Our aim is now to express the optimum detection statistic U in terms of the causal
estimator $\hat{S}_1(t)$ given by (11-117) with z = 1. If our detector possessed the input
$v(t) = \mathrm{Re}\, V(t) \exp i\Omega t$ only during the interval $(0, \tau)$, it would optimally base its decision
at time $\tau$ on the statistic

$$U(\tau) = \tfrac{1}{2} \int_0^{\tau}\!\!\int_0^{\tau} V^*(r)\, h(r, t; 1; \tau)\, V(t)\, dr\, dt
= \mathrm{Re} \int_0^{\tau} V^*(r) \int_0^{r} h(r, t; 1; \tau)\, V(t)\, dt\, dr \tag{11-145}$$

by (11-35), (11-53), and (11-39). We shall demonstrate that

$$U = U(T) = \frac{1}{N} \mathrm{Re} \int_0^T \hat{S}_1^*(\tau) V(\tau)\, d\tau - \frac{1}{2N} \int_0^T |\hat{S}_1(\tau)|^2\, d\tau. \tag{11-146}$$

In this analysis we follow [Sch65]. The form (11-146) that results is called the
causal estimator-correlator representation of the optimum statistic U for detecting
the stochastic signal $\mathrm{Re}\, S(t) \exp i\Omega t$ in white noise.

Differentiating (11-145) with respect to $\tau$, we obtain

$$\frac{dU}{d\tau} = \tfrac{1}{2} V^*(\tau) \int_0^{\tau} h(\tau, t; 1; \tau)\, V(t)\, dt + \text{c.c.} + \tfrac{1}{2} \int_0^{\tau}\!\!\int_0^{\tau} V^*(r)\, h_\tau(r, t; 1; \tau)\, V(t)\, dr\, dt,$$

where c.c. stands for complex conjugate. Now using (11-129), (11-117), and (11-121),
we can write this as

$$\frac{dU}{d\tau} = \frac{1}{N} \mathrm{Re}[V^*(\tau) \hat{S}_1(\tau)] - \frac{N}{2} \int_0^{\tau}\!\!\int_0^{\tau} V^*(r)\, h(r, \tau; 1; \tau)\, h(\tau, t; 1; \tau)\, V(t)\, dr\, dt$$
$$= \frac{1}{N} \mathrm{Re}[V^*(\tau) \hat{S}_1(\tau)] - \frac{1}{2N} |\hat{S}_1(\tau)|^2.$$

Integrating this differential equation over (0, T), we obtain (11-146).

According to (11-31) the likelihood functional for the detection of a narrowband
stochastic signal $s(t) = \mathrm{Re}\, S(t) \exp i\Omega t$ in an input $v(t) = \mathrm{Re}\, V(t) \exp i\Omega t$
containing white noise is

$$\Lambda[v(t)] = \exp(U - B)
= \exp\left[\frac{1}{N} \mathrm{Re} \int_0^T \hat{S}_1^*(t) V(t)\, dt - \frac{1}{2N} \int_0^T |\hat{S}_1(t)|^2\, dt - B\right] \tag{11-147}$$

by (11-146), where by (11-34)

$$B = \ln \det(\mathbf{I} + N^{-1}\boldsymbol{\phi}_s) = \ln D(1; T) = E(1; T).$$

If we compare this likelihood functional with that in (3-54) as rewritten for detection
of a known signal $\mathrm{Re}\, S(t) \exp i\Omega t$ in white noise,

$$\Lambda[v(t)] = \exp\left[\frac{1}{N} \mathrm{Re} \int_0^T S^*(t) V(t)\, dt - \frac{1}{2N} \int_0^T |S(t)|^2\, dt\right],$$

we see that the two are much alike except that the complex envelope S(t) of the signal
is replaced by its causal MAP estimator $\hat{S}_1(t)$ and except that (11-147) contains an
extraneous term B.

Kailath [Kai69] showed that for the detection of low-pass stochastic signals s(t)
in white noise n(t), what corresponds to that term B can be eliminated by altering the
definition of the integrals in (11-147) and replacing the ordinary Riemann integral
that we have implicitly been utilizing up to now with the Itô integral. By doing so,
he was able to broaden the scope of the result; the stochastic signal s(t) need not
be a sample of a Gaussian random process, but can have any probability measure,
provided that by $\hat{s}_1(t)$ one understands that causal estimator of s(t) minimizing the
mean-square error $E[\hat{s}_1(t) - s(t)]^2$. According to Sec. 6.1.2, this is the conditional
expected value

$$\hat{s}_1(t) = E[s(t) \mid v(t'),\ 0 \le t' < t,\ H_1], \qquad v(t) = s(t) + n(t).$$

The likelihood functional for detecting this stochastic signal in white noise is written

$$\Lambda[v(t)] = \exp\left[\frac{2}{N} ⨍_0^T \hat{s}_1(t) v(t)\, dt - \frac{1}{N} \int_0^T [\hat{s}_1(t)]^2\, dt\right] \tag{11-148}$$

in which ⨍ indicates the Itô integral [Kai69]. For a Gaussian stochastic signal the
estimator is the linear MAP estimator defined as in (11-117). For signals with other
probability measures it is unknown how to calculate the required conditional expected
value $\hat{s}_1(t)$. The considerations underlying this reformulation of the likelihood
functional involve the properties of broadband white noise treated as the derivative of
the Wiener-Levy or Brownian stochastic process. In this chapter we are dealing
with narrowband signals received in narrowband white noise, that is, noise whose
spectral density is flat over a range of frequencies much wider than that occupied
by the signals, but not extending beyond a fraction of the carrier frequency $\Omega$ of
the signals. A formulation of the likelihood functional of the kind in (11-148) is
inappropriate, and referring the interested reader to [Kai69], we forbear undertaking
its derivation.

11.3.6 The Innovation Process 



Let us denote by m(t, s) the function

$$m(t, s) = \begin{cases} m(t, s; 1; t), & 0 \le s \le t \le T, \\ 0, & 0 \le t < s \le T. \end{cases} \tag{11-149}$$

Then the causal estimator of the envelope S(t) of the signal is

$$\hat{S}_1(t) = \int_0^T m(t, s)\, V(s)\, ds, \qquad 0 \le t \le T, \tag{11-150}$$

by (11-117) with z = 1, and the integration can be taken over (0, T) by virtue of
the second part of (11-149). Substituting this into (11-146), we write the optimum
detection statistic U as



$$U = \frac{1}{N} \mathrm{Re} \int_0^T\!\!\int_0^T V^*(u)\, m(u, s)\, V(s)\, du\, ds
- \frac{1}{2N} \int_0^T\!\!\int_0^T\!\!\int_0^T V^*(u)\, m^*(t, u)\, m(t, s)\, V(s)\, dt\, du\, ds$$
$$= \tfrac{1}{2} \int_0^T\!\!\int_0^T V^*(t)\, h(t, s)\, V(s)\, dt\, ds,$$

where h(t, s) is the kernel of (11-35). Because this equation must hold for arbitrary
V(t), we find the relationship

$$N h(t, s) = m(t, s; 1; T) = m(t, s) + m^*(s, t) - \int_0^T m^*(u, t)\, m(u, s)\, du, \qquad 0 \le (t, s) \le T. \tag{11-151}$$

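If we denote by $\mathbf{m}$ the matrix corresponding to the function m(t, s) and by
$N\mathbf{H} = \mathbf{M}$ that corresponding to m(t, s; 1; T), (11-151) becomes

$$\mathbf{M} = \mathbf{m} + \mathbf{m}^+ - \mathbf{m}^+\mathbf{m}$$

or

$$\mathbf{I} - \mathbf{M} = \mathbf{I} - \mathbf{m} - \mathbf{m}^+ + \mathbf{m}^+\mathbf{m} = (\mathbf{I} - \mathbf{m}^+)(\mathbf{I} - \mathbf{m}). \tag{11-152}$$

The right side is a factorization of the matrix $\mathbf{I} - \mathbf{M}$ that resembles the decomposition
of a matrix into upper-triangular and lower-triangular factors. Because for t > s
$m^*(s, t) = 0$, (11-151) is equivalent to the nonlinear integral equation

$$m(t, s; 1; T) = m(t, s) - \int_t^T m^*(u, t)\, m(u, s)\, du, \qquad 0 \le s \le t \le T. \tag{11-153}$$

For signals having a Lorentz spectral density, by (11-56),

$$m(t, s; 1; T) = \frac{\mu P_s}{N\beta}\, \frac{[(\beta + \mu) e^{\beta s} + (\beta - \mu) e^{-\beta s}][(\beta + \mu) e^{\beta(T - t)} + (\beta - \mu) e^{-\beta(T - t)}]}{(\beta + \mu)^2 e^{\beta T} - (\beta - \mu)^2 e^{-\beta T}},$$
$$\beta^2 = \mu^2 + \frac{2\mu P_s}{N}, \qquad 0 \le s \le t \le T,$$

and

$$m(t, s) = \frac{2\mu P_s}{N}\, \frac{(\beta + \mu) e^{\beta s} + (\beta - \mu) e^{-\beta s}}{(\beta + \mu)^2 e^{\beta t} - (\beta - \mu)^2 e^{-\beta t}}. \tag{11-154}$$

The reader might amuse himself by showing that these functions indeed satisfy
(11-153).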

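The random process

$$N'(t) = V(t) - \hat{S}_1(t)$$

is called the innovation process because it represents the new information at time t
about the signal and the noise that is not embodied in the conditional expected value

$$\hat{S}_1(t) = E[S(t) \mid V(t'),\ 0 \le t' < t,\ H_1] = \int_0^t m(t, u)\, V(u)\, du.$$

Kailath [Kai68] showed that this process N'(t) is white and possesses the same
unilateral spectral density as the input noise N(t). In vector form

$$\mathbf{N}' = \mathbf{V} - \hat{\mathbf{S}}_1 = (\mathbf{I} - \mathbf{m})\mathbf{V},$$

and the covariance matrix of the elements of the vector $\mathbf{N}'$ is

$$\boldsymbol{\phi}_{n'} = \tfrac{1}{2} E(\mathbf{N}'\mathbf{N}'^+ \mid H_1) = \tfrac{1}{2}(\mathbf{I} - \mathbf{m}) E(\mathbf{V}\mathbf{V}^+ \mid H_1)(\mathbf{I} - \mathbf{m}^+)
= (\mathbf{I} - \mathbf{m})(N\mathbf{I} + \boldsymbol{\phi}_s)(\mathbf{I} - \mathbf{m}^+).$$

Now by (11-32),

$$\mathbf{I} - \mathbf{M} = \mathbf{I} - N\mathbf{H} = \mathbf{I} - N[N^{-1}\mathbf{I} - (N\mathbf{I} + \boldsymbol{\phi}_s)^{-1}] = N(N\mathbf{I} + \boldsymbol{\phi}_s)^{-1},$$

whereupon by (11-152)

$$\boldsymbol{\phi}_{n'} = N(\mathbf{I} - \mathbf{m})(\mathbf{I} - \mathbf{M})^{-1}(\mathbf{I} - \mathbf{m}^+)
= N(\mathbf{I} - \mathbf{m})(\mathbf{I} - \mathbf{m})^{-1}(\mathbf{I} - \mathbf{m}^+)^{-1}(\mathbf{I} - \mathbf{m}^+) = N\mathbf{I},$$

or in the time domain

$$\tfrac{1}{2} E[N'(t) N'^*(s) \mid H_1] = N\delta(t - s),$$

and the innovation process N'(t) is white.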

11.4 THE STATE-SPACE FORMULATION 

11.4.1 Generation of the Signal by a Linear System

When the narrowband spectral density $\Phi_s(\omega)$ of the signal $s(t) = \mathrm{Re}\, S(t) \exp i\Omega t$ is
a rational function of $\omega$, the signal can be thought of as having been generated by a
time-invariant linear system whose input is white noise. The output of the system,
that is, the signal envelope, is then the solution of a linear differential equation of
finite order with constant coefficients. This differential equation can be decomposed
into a set of n first-order linear differential equations for a set of random processes
$x_k(t)$, which are organized as a column vector having n elements:

$$\mathbf{x}(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{pmatrix}.$$

The integer n is one-half the degree of the denominator of the spectral density $\Phi_s(\omega)$
as a polynomial in $\omega$. The vector $\mathbf{x}(t)$ is called the state vector of the linear system,
and the set of first-order differential equations, or state equations, can be concisely
written as

$$\frac{d\mathbf{x}}{dt} = \mathbf{F}\mathbf{x}(t) + \mathbf{G}w(t), \tag{11-155}$$

in which $\mathbf{F}$ is an n × n matrix, $\mathbf{G}$ is a column vector of n elements, and w(t) is
the circular complex white noise driving the system. The n components $x_j(t)$ are
considered to be circular complex Gaussian random processes. The elements of $\mathbf{F}$
and $\mathbf{G}$ may be complex. In what follows we refer to the complex envelope S(t) as
simply "the signal."

The white noise $\mathrm{Re}\, w(t) \exp i\Omega t$ is statistically independent of the white noise
$\mathrm{Re}\, N(t) \exp i\Omega t$ in which the signal $\mathrm{Re}\, S(t) \exp i\Omega t$ is to be detected. Its complex
autocovariance function is

$$\tfrac{1}{2} E[w(t_1) w^*(t_2)] = Q\delta(t_1 - t_2), \tag{11-156}$$

and its spectral density Q is proportional to the average power $P_s$ of the stochastic
signal S(t). That signal, as a linear combination of the n processes $x_k(t)$, is given by

$$S(t) = \mathbf{C}\mathbf{x}(t), \tag{11-157}$$

in which $\mathbf{C}$ is a row vector of n elements. A signal generated from white noise in
this way might be termed leucogenic.

This state-space model of the generation of the stochastic signal by white noise 
can be generalized by allowing the matrix F, the vectors G and C, and the spectral 
density Q to be functions of the time t. The signal S(t) is then a nonstationary 
random process. The outcome of this formulation is a set of nonlinear differential 
equations from whose solution one can calculate the Fredholm determinant D(z) 
needed for computing the moment-generating functions of the detection statistic 
and thence the false-alarm and detection probabilities of the stochastic signal. From 
related equations the causal estimate of the signal and the optimum detection 
statistic U can be computed in real time. Given an arbitrary autocovariance function 
$\phi_s(t, s)$ for a nonstationary signal, it is in general difficult to find a state-space
model whose output closely approximates the signal process S(t). Methods for doing 
so are to be found in [Bag71, pp. 292-309], in [Lju83], and in other books on 
the topic of system identification. In our examples we shall restrict ourselves to 
stationary stochastic signals with rational spectral densities, taking F, G, C, and Q 
to be constant. 

The first step in the procedure is to set up the state-space model by determining
the matrix $\mathbf{F}$ and the vectors $\mathbf{G}$ and $\mathbf{C}$. It is best illustrated for stationary leucogenic
signals by an example, in which the spectral density of the signal will be taken as

$$\Phi_s(\omega) = \frac{Q}{(\omega^2 + a^2)[(\omega - b)^2 + c^2]} \tag{11-158}$$

and n = 2. A certain arbitrariness is involved, and we can choose the vector $\mathbf{C}$ as
we like. Let us take $\mathbf{C} = (1\ \ 0)$ so that the signal is the first component of the state
vector. Factoring $\Phi_s(\omega)$, we see that the signal can be generated by passing white
noise w(t) through a filter whose narrowband transfer function is

$$Y(\omega) = \frac{1}{(a + i\omega)(c - ib + i\omega)},$$

with $\Phi_s(\omega) = Q|Y(\omega)|^2$. Then the output $x_1(t)$ satisfies the differential equation

$$\left[\frac{d}{dt} + a\right]\left[\frac{d}{dt} + c - ib\right] x_1(t) = w(t).$$

A set of state equations is most easily derived by defining

$$x_2 = \frac{dx_1}{dt} + (c - ib)x_1,$$

so that

$$\frac{dx_2}{dt} + a x_2 = w(t).$$

Thus the equations (11-155) take the form

$$\frac{dx_1}{dt} = -(c - ib)x_1 + x_2, \qquad \frac{dx_2}{dt} = -a x_2 + w(t),$$

and we find

$$\mathbf{F} = \begin{pmatrix} -(c - ib) & 1 \\ 0 & -a \end{pmatrix}, \qquad \mathbf{G} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

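The solution of the state equations (11-155) in terms of the input w(t) and the
initial conditions on $\mathbf{x}(t)$ at $t = t_0$ can be formally specified in terms of the matrix
impulse response $\mathbf{K}(t, \tau)$ of the system:

$$\mathbf{x}(t) = \mathbf{K}(t, t_0)\mathbf{x}(t_0) + \int_{t_0}^{t} \mathbf{K}(t, \tau)\mathbf{G}(\tau) w(\tau)\, d\tau. \tag{11-159}$$

The n × n impulse response matrix satisfies the differential equation

$$\frac{\partial}{\partial t}\mathbf{K}(t, \tau) = \mathbf{F}(t)\mathbf{K}(t, \tau), \qquad t > \tau, \tag{11-160}$$

which must be solved with the initial conditions $\mathbf{K}(\tau, \tau) = \mathbf{I}$, with $\mathbf{I}$ the n × n identity
matrix. For a time-invariant linear system,

$$\mathbf{K}(t, \tau) = e^{\mathbf{F}(t - \tau)}\, U(t - \tau);$$

the exponential function of a matrix is defined in terms of the Taylor series,

$$e^{\mathbf{F}t} = \mathbf{I} + \mathbf{F}t + \tfrac{1}{2}\mathbf{F}^2 t^2 + \cdots + \frac{1}{k!}\mathbf{F}^k t^k + \cdots.$$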
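
The Taylor series for the matrix exponential can be summed directly for small systems. The following is a minimal sketch, assuming a truncated series is adequate (it is for modest values of the norm of Ft; production code would use a scaling-and-squaring routine instead). The helper names are ours, and the test matrix is the 2 × 2 example with state equations dx_1/dt = -(c - ib)x_1 + x_2, dx_2/dt = -a x_2 + w(t).

```python
import cmath

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def expm_taylor(F, t, terms=40):
    """Truncated Taylor series I + F t + F^2 t^2/2! + ... for exp(F t)."""
    n = len(F)
    result = [[(1.0 + 0j) if i == j else 0j for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term F^k t^k / k!
    for k in range(1, terms):
        term = [[x * t / k for x in row] for row in mat_mul(term, F)]
        for i in range(n):
            for j in range(n):
                result[i][j] += term[i][j]
    return result

if __name__ == "__main__":
    a, b, c, t = 2.0, 0.5, 1.0, 0.7            # illustrative values only
    kappa = c - 1j * b
    K = expm_taylor([[-kappa, 1.0], [0.0, -a]], t)
    # For this upper-triangular F the exponential is known in closed form.
    print(K[0][0], cmath.exp(-kappa * t))
    print(K[0][1], (cmath.exp(-a * t) - cmath.exp(-kappa * t)) / (kappa - a))
```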

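For later use we need the n × n covariance matrix of the state vector $\mathbf{x}(t)$,
defined by

$$\mathbf{L}(t) = \tfrac{1}{2} E[\mathbf{x}(t)\mathbf{x}^+(t)], \tag{11-161}$$

in which

$$\mathbf{x}^+(t) = (x_1^*(t), x_2^*(t), \ldots, x_n^*(t))$$

is the row vector formed by transposing the state vector and taking the complex
conjugates of the elements. Differentiating (11-161) and using the state equations
(11-155), we obtain

$$\dot{\mathbf{L}} = \tfrac{1}{2} E[\dot{\mathbf{x}}(t)\mathbf{x}^+(t) + \mathbf{x}(t)\dot{\mathbf{x}}^+(t)]
= \tfrac{1}{2} E(\mathbf{F}\mathbf{x}\mathbf{x}^+ + \mathbf{G}w\mathbf{x}^+ + \mathbf{x}\mathbf{x}^+\mathbf{F}^+ + \mathbf{x}w^*\mathbf{G}^+), \tag{11-162}$$

the dot standing for a time derivative. Now by (11-159) and (11-156),

$$\tfrac{1}{2} E[\mathbf{x}(t) w^*(t)] = \tfrac{1}{2} \int_0^t \mathbf{K}(t, \tau)\mathbf{G}(\tau) E[w(\tau) w^*(t)]\, d\tau$$
$$= Q \int_0^t \mathbf{K}(t, \tau)\mathbf{G}(\tau)\, \delta(\tau - t)\, d\tau$$
$$= \tfrac{1}{2} Q \mathbf{K}(t, t)\mathbf{G}(t) = \tfrac{1}{2} Q \mathbf{G}(t),$$

the factor of ½ entering because the delta function $\delta(\tau - t)$ stands at the end of the
interval of integration with only half its "mass" inside. Similarly

$$\tfrac{1}{2} E[w(t)\mathbf{x}^+(t)] = \tfrac{1}{2} Q \mathbf{G}^+(t),$$

and substituting into (11-162) we obtain the differential equation

$$\frac{d\mathbf{L}}{dt} = \mathbf{F}\mathbf{L} + \mathbf{L}\mathbf{F}^+ + Q\mathbf{G}\mathbf{G}^+ \tag{11-163}$$

for the covariance matrix $\mathbf{L}(t)$ of the state vector.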

When the input noise is stationary and the linear system is in steady state, $\mathbf{L}$
is a constant matrix $\mathbf{L}_\infty$ and satisfies the steady-state covariance equation

$$\mathbf{F}\mathbf{L}_\infty + \mathbf{L}_\infty\mathbf{F}^+ + Q\mathbf{G}\mathbf{G}^+ = 0. \tag{11-164}$$

If at time t = 0, $\mathbf{L}(0) = \mathbf{L}_\infty$, the output of the system will be a stationary random
process. By starting with other initial covariance matrices $\mathbf{L}(0)$, nonstationary
outputs with a variety of autocovariance functions can be generated.

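For $t_1 > t_2$ the covariance matrix of the state vector is, by (11-161) and (11-159)
with $t = t_1$, $t_0 = t_2$,

$$\boldsymbol{\phi}_x(t_1, t_2) = \tfrac{1}{2} E[\mathbf{x}(t_1)\mathbf{x}^+(t_2)] = \tfrac{1}{2}\mathbf{K}(t_1, t_2) E[\mathbf{x}(t_2)\mathbf{x}^+(t_2)]
= \mathbf{K}(t_1, t_2)\mathbf{L}(t_2), \qquad t_1 > t_2,$$

for the white noise w(t) for $t > t_2$ is uncorrelated with $\mathbf{x}(t_2)$. Similarly

$$\boldsymbol{\phi}_x(t_1, t_2) = \mathbf{L}(t_1)\mathbf{K}^+(t_2, t_1), \qquad t_1 < t_2.$$

The autocovariance function of the signal itself is

$$\phi_s(t, u) = \mathbf{C}\boldsymbol{\phi}_x(t, u)\mathbf{C}^+ \tag{11-166}$$

by (11-157).

For a stationary process with $\mathbf{L}$ constant, the covariance matrix of the state
vector is

$$\boldsymbol{\phi}_x(\tau) = \tfrac{1}{2} E[\mathbf{x}(t + \tau)\mathbf{x}^+(t)] = \begin{cases} e^{\mathbf{F}\tau}\mathbf{L}_\infty, & \tau \ge 0, \\ \mathbf{L}_\infty e^{-\mathbf{F}^+\tau}, & \tau < 0, \end{cases} \tag{11-165}$$

provided that the linear system is stable, so that stationarity can be attained. For
stability the eigenvalues of $\mathbf{F}$ must have negative real parts. The spectral density
matrix of the state vector is the Fourier transform

$$\boldsymbol{\Phi}_x(\omega) = (i\omega\mathbf{I} - \mathbf{F})^{-1}\mathbf{L}_\infty - \mathbf{L}_\infty(i\omega\mathbf{I} + \mathbf{F}^+)^{-1}. \tag{11-167}$$

The spectral density of the signal is then

$$\Phi_s(\omega) = \mathbf{C}\boldsymbol{\Phi}_x(\omega)\mathbf{C}^+. \tag{11-168}$$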

For our example (11-158) the reader should show by substitution into (11-164)
that the steady-state covariance matrix is

$$\mathbf{L}_\infty = \frac{Q}{2a} \begin{pmatrix} \dfrac{c + a}{c\,|c + a - ib|^2} & \dfrac{1}{c + a - ib} \\[2ex] \dfrac{1}{c + a + ib} & 1 \end{pmatrix} \tag{11-169}$$

and verify that (11-167) and (11-168) lead back to the spectral density in (11-158).
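
The two checks requested of the reader can also be carried out numerically. The sketch below assumes the example model in the form F = [[-(c - ib), 1], [0, -a]], G = (0, 1)ᵀ, C = (1 0), a steady-state covariance matrix with entries (Q/2a)(c + a)/(c|c + a - ib|²), (Q/2a)/(c + a - ib), its conjugate, and Q/2a, and a spectral density Q/{(ω² + a²)[(ω - b)² + c²]}. The helper names and the numerical parameter values are illustrative only.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    """Hermitian conjugate (conjugate transpose)."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def mat_add(A, B, sign=1.0):
    return [[A[i][j] + sign * B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

def inv2(A):
    """Inverse of a 2 x 2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def example_model(a, b, c):
    F = [[-(c - 1j * b), 1.0 + 0j], [0j, -a + 0j]]
    G = [[0j], [1.0 + 0j]]
    C = [[1.0 + 0j, 0j]]
    return F, G, C

def steady_state_covariance(a, b, c, Q):
    """Closed form assumed for L_infinity, as in (11-169)."""
    s = c + a - 1j * b
    k = Q / (2.0 * a)
    return [[k * (c + a) / (c * abs(s) ** 2) + 0j, k / s],
            [k / s.conjugate(), k + 0j]]

def lyapunov_residual(F, G, L, Q):
    """F L + L F^+ + Q G G^+, which (11-164) requires to vanish."""
    GG = mat_mul(G, adjoint(G))
    R = mat_add(mat_mul(F, L), mat_mul(L, adjoint(F)))
    return mat_add(R, [[Q * x for x in row] for row in GG])

def signal_spectrum(w, F, C, L):
    """C [(iwI - F)^-1 L - L (iwI + F^+)^-1] C^+, from (11-167) and (11-168)."""
    iw = [[1j * w, 0j], [0j, 1j * w]]
    left = mat_mul(inv2(mat_add(iw, F, -1.0)), L)
    right = mat_mul(L, inv2(mat_add(iw, adjoint(F))))
    return mat_mul(mat_mul(C, mat_add(left, right, -1.0)), adjoint(C))[0][0]

if __name__ == "__main__":
    a, b, c, Q, w = 1.5, 0.7, 0.9, 2.0, 0.4
    F, G, C = example_model(a, b, c)
    L = steady_state_covariance(a, b, c, Q)
    print(lyapunov_residual(F, G, L, Q))
    print(signal_spectrum(w, F, C, L), Q / ((w * w + a * a) * ((w - b) ** 2 + c * c)))
```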
11.4.2 The Kalman-Bucy Equations

We turn now to determining the causal estimator of the signal $S_z(t)$, whose complex
autocovariance function is $z\phi_s(t, u)$, in terms of the state-space model of the linear
system that generates S(t). It will then be possible to set up a linear system that
generates the optimum statistic U for detecting the signal $s(t) = \mathrm{Re}\, S(t) \exp i\Omega t$
from the input $v(t) = \mathrm{Re}\, V(t) \exp i\Omega t$ as it arrives, and we shall be able to compute
the Fredholm determinant in terms of the solution of a set of nonlinear differential
equations known as the Kalman-Bucy equations. These equations play a central
role in the prediction and filtering of random processes.
The signal $S_z(t)$ is given by

$$S_z(t) = \mathbf{C}\mathbf{x}_z(t), \tag{11-170}$$

where the n-vector $\mathbf{x}_z(t)$ obeys the state equations (11-155), except that the strength
Q of the driving white noise is replaced by zQ.

Because both the signal $S_z(t)$ and the noise N(t) are circular complex Gaussian
random processes, the maximum-a-posteriori-probability (MAP) estimator of this
signal at time $t = \tau$ is the causal linear estimator that minimizes the mean-square
error

$$\tfrac{1}{2} E|\hat{S}_z(\tau) - S_z(\tau)|^2,$$

and it is the conditional expected value

$$\hat{S}_z(\tau) = E[S_z(\tau) \mid V(t),\ 0 \le t < \tau,\ H_1]$$

of the signal, given the input $V(t) = S_z(t) + N(t)$ to the receiver during the interval
$(0, \tau)$. By (11-170)

$$\hat{S}_z(\tau) = \mathbf{C} E[\mathbf{x}_z(\tau) \mid V(t),\ 0 \le t < \tau,\ H_1] = \mathbf{C}\hat{\mathbf{x}}_z(\tau),$$

where $\hat{\mathbf{x}}_z(\tau)$ is the minimum-mean-square-error estimator of the state vector $\mathbf{x}_z(\tau)$.
The latter estimator will be a linear functional

$$\hat{\mathbf{x}}_z(\tau) = \int_0^{\tau} \mathbf{k}_z(\tau, t)\, V(t)\, dt \tag{11-171}$$

of the input V(t) during the interval $(0, \tau)$; here $\mathbf{k}_z(\tau, t)$ is a column vector with n
components.

The estimator $\hat{S}_z(\tau)$ is given by (11-117), whose kernel $m(\tau, t; z; \tau)$ is the
solution of the integral equation (11-132) with $r = \tau$,

$$N m(\tau, t; z; \tau) + z \int_0^{\tau} m(\tau, u; z; \tau)\, \phi_s(u, t)\, du = z\phi_s(\tau, t), \qquad 0 \le t \le \tau, \tag{11-172}$$

and we can therefore write

$$m(\tau, t; z; \tau) = \mathbf{C}(\tau)\mathbf{k}_z(\tau, t), \tag{11-173}$$

where $\mathbf{k}_z(\tau, t)$ obeys the vector integral equation

$$z\boldsymbol{\phi}_x(\tau, t)\mathbf{C}^+(t) = N\mathbf{k}_z(\tau, t) + z \int_0^{\tau} \mathbf{k}_z(\tau, u)\, \phi_s(u, t)\, du. \tag{11-174}$$

Multiplying this equation on the left by the row vector $\mathbf{C}(\tau)$ and using (11-166) yield
(11-172).

In Appendix I we show from (11-174) that the column vector $\mathbf{k}_z(\tau, t)$ obeys
the vector differential equation

$$\frac{\partial \mathbf{k}_z(\tau, t)}{\partial \tau} = \mathbf{F}(\tau)\mathbf{k}_z(\tau, t) - \mathbf{k}_z(\tau, \tau)\mathbf{C}(\tau)\mathbf{k}_z(\tau, t), \qquad 0 \le t < \tau. \tag{11-175}$$

Differentiating (11-171) with respect to $\tau$ and using (11-175), we find that the
estimator $\hat{\mathbf{x}}_z(\tau)$ of the state vector obeys the differential equation

$$\frac{d\hat{\mathbf{x}}_z}{d\tau} = \mathbf{F}\hat{\mathbf{x}}_z(\tau) + \mathbf{k}_z(\tau, \tau)[V(\tau) - \mathbf{C}\hat{\mathbf{x}}_z(\tau)]. \tag{11-176}$$

Because nothing is known about the state vector $\mathbf{x}_z(t)$ at time t = 0, the best
estimators of its components are zero at t = 0, and the initial condition on (11-176) is
$\hat{\mathbf{x}}_z(0) = 0$.

The vector differential equation (11-175) must be solved with the initial
condition $\mathbf{k}_z(t, t)$ at $\tau = t$. Both the equation itself and its initial condition depend on
this vector function $\mathbf{k}_z(\tau, \tau)$. It is given by

$$\mathbf{k}_z(\tau, \tau) = N^{-1}\mathbf{P}_z(\tau)\mathbf{C}^+(\tau) \tag{11-177}$$

in terms of the n × n matrix function

$$\mathbf{P}_z(\tau) = z\boldsymbol{\phi}_x(\tau, \tau) - z \int_0^{\tau} \mathbf{k}_z(\tau, u)\mathbf{C}(u)\boldsymbol{\phi}_x(u, \tau)\, du. \tag{11-178}$$

If we multiply this integral equation by $N^{-1}$ and, on the right, by the column vector
$\mathbf{C}^+(\tau)$ and recall (11-166), we obtain (11-174) with $t = \tau$.

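When the parameter z is real and positive, the matrix $\mathbf{P}_z(\tau)$ is the covariance
matrix of the error $\mathbf{x}_z(\tau) - \hat{\mathbf{x}}_z(\tau)$ in the estimator of the state vector $\mathbf{x}_z(\tau)$:

$$\mathbf{P}_z(\tau) = \tfrac{1}{2} E\{[\mathbf{x}_z(\tau) - \hat{\mathbf{x}}_z(\tau)][\mathbf{x}_z^+(\tau) - \hat{\mathbf{x}}_z^+(\tau)]\}.$$

By the principle of orthogonality introduced in Sec. 6.1.4, the error vector
$\mathbf{x}_z(\tau) - \hat{\mathbf{x}}_z(\tau)$ is orthogonal to the data V(t) and hence to the estimator $\hat{\mathbf{x}}_z(\tau)$, which is a
linear functional of the data:

$$\tfrac{1}{2} E\{[\mathbf{x}_z(\tau) - \hat{\mathbf{x}}_z(\tau)]\hat{\mathbf{x}}_z^+(\tau)\} = 0.$$

Therefore, by (11-171),

$$\mathbf{P}_z(\tau) = \tfrac{1}{2} E\{[\mathbf{x}_z(\tau) - \hat{\mathbf{x}}_z(\tau)]\mathbf{x}_z^+(\tau)\}
= z\boldsymbol{\phi}_x(\tau, \tau) - \tfrac{1}{2} E \int_0^{\tau} \mathbf{k}_z(\tau, u)\, V(u)\, \mathbf{x}_z^+(\tau)\, du,$$

and if we substitute $V(u) = N(u) + \mathbf{C}\mathbf{x}_z(u)$ and remember that the noise N(t) is
independent of the noise w(t) driving the linear system and hence of the state vector
$\mathbf{x}_z(\tau)$, we obtain (11-178).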

The integral equation (11-178) is of no use to us at this point because it contains
the unknown vector function $\mathbf{k}_z(\tau, u)$. In Appendix I we derive from it the nonlinear
matrix differential equation

$$\frac{d\mathbf{P}_z}{d\tau} = \mathbf{F}\mathbf{P}_z + \mathbf{P}_z\mathbf{F}^+ + zQ\mathbf{G}\mathbf{G}^+ - N^{-1}\mathbf{P}_z\mathbf{C}^+\mathbf{C}\mathbf{P}_z. \tag{11-179}$$

This must be solved with the initial condition $\mathbf{P}_z(0) = z\boldsymbol{\phi}_x(0, 0) = z\mathbf{L}(0)$, which
follows from (11-178) with $\tau = 0$. Because of its form it is called a Riccati equation.
Once we have solved it, we have from (11-177) the vector $\mathbf{k}_z(\tau, \tau)$ needed in (11-175).

Equations (11-175) and (11-179) are the Kalman-Bucy equations [Kal61] for
the minimum-mean-square-error estimator (11-171) of the state vector $\mathbf{x}_z(\tau)$, given
an input of the form

$$V(t) = N(t) + \mathbf{C}\mathbf{x}_z(t)$$

during the interval $0 \le t < \tau$. The vector $\mathbf{k}_z(\tau, \tau)$ is called the Kalman gain. Most
derivations of these equations are statistical in nature, requiring the assumption that
the parameter z is real and positive so that $S_z(t)$ can be considered a true random
process with a real average power proportional to z. Here we regard these equations
as purely a mathematical device for solving the integral equation (11-172) when the
underlying signal S(t) can be considered as generated by a linear system described by
the state equations (11-155) and driven by circular complex white noise w(t). In this
way we are free to take z as a complex variable, as in evaluating saddlepoint integrals.

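From the solution of these equations we can calculate the Fredholm
determinant D(z), for by (11-173) and (11-177)

$$m(\tau, \tau; z; \tau) = \mathbf{C}(\tau)\mathbf{k}_z(\tau, \tau) = N^{-1}\mathbf{C}(\tau)\mathbf{P}_z(\tau)\mathbf{C}^+(\tau) \tag{11-180}$$

and by (11-130)

$$\ln D(z) = \frac{1}{N} \int_0^T \mathbf{C}(\tau)\mathbf{P}_z(\tau)\mathbf{C}^+(\tau)\, d\tau. \tag{11-181}$$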

This equation holds even for complex values of z and can be used, for instance, to 
calculate the moment-generating function h n (z) on a path of integration in (11-94) 
or (11-95) when computing false-alarm and detection probabilities for Gaussian 
stochastic signals as in Sec. 11.2.4. 
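
As an illustration of this route to D(z), the scalar case can be integrated directly. The sketch below assumes the one-state Lorentz model (F = -μ, G = C = 1, Q = 2μP_s), for which the Riccati equation (11-179) reduces to dP/dτ = -2μP + 2μP_s z - P²/N with P(0) = zP_s, and it assumes the closed form D(z; T) = [(β + μ)² e^{(β-μ)T} - (β - μ)² e^{-(β+μ)T}]/(4βμ) with β² = μ² + 2μP_s z/N for comparison. The fixed-step fourth-order Runge-Kutta scheme and the function names are our choices, not the text's.

```python
import cmath

def riccati_ln_D(z, T, mu, Ps, N, steps=4000):
    """Integrate the scalar Riccati equation and (1/N) * integral of P, cf. (11-181).
    Returns (ln D(z), P(T)); z may be complex."""
    def f(P):
        return -2.0 * mu * P + 2.0 * mu * Ps * z - P * P / N

    dt = T / steps
    P = z * Ps + 0j            # initial condition P(0) = z L(0) = z Ps
    integral = 0j              # running value of integral_0^tau P dt
    for _ in range(steps):
        k1 = f(P)
        k2 = f(P + 0.5 * dt * k1)
        k3 = f(P + 0.5 * dt * k2)
        k4 = f(P + dt * k3)
        # RK4 stages for the auxiliary equation dI/dt = P
        integral += dt * (P + 2.0 * (P + 0.5 * dt * k1)
                          + 2.0 * (P + 0.5 * dt * k2) + (P + dt * k3)) / 6.0
        P += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return integral / N, P

def D_closed(z, T, mu, Ps, N):
    """Closed-form Fredholm determinant assumed for the Lorentz density."""
    b = cmath.sqrt(mu * mu + 2.0 * mu * Ps * z / N)
    return ((b + mu) ** 2 * cmath.exp((b - mu) * T)
            - (b - mu) ** 2 * cmath.exp(-(b + mu) * T)) / (4.0 * b * mu)

if __name__ == "__main__":
    for z in (1.0, 0.5 + 0.5j):
        lnD, P_T = riccati_ln_D(z, 2.0, 1.0, 2.0, 1.0)
        print(z, cmath.exp(lnD), D_closed(z, 2.0, 1.0, 2.0, 1.0), P_T)
```

Because the Riccati equation is analytic in z, nothing in the integration changes when z is taken complex, which is what the saddlepoint computations require.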

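The Riccati equation (11-179) is a set of $n^2$ coupled nonlinear differential
equations of first order, and they must in most cases be solved numerically. They
can be replaced by twice as many linear equations, embodied in the pair of linear
matrix differential equations

$$\frac{d\mathbf{R}_1}{dt} = \mathbf{F}\mathbf{R}_1 + zQ\mathbf{G}\mathbf{G}^+\mathbf{R}_2, \tag{11-182}$$
$$\frac{d\mathbf{R}_2}{dt} = N^{-1}\mathbf{C}^+\mathbf{C}\mathbf{R}_1 - \mathbf{F}^+\mathbf{R}_2, \tag{11-183}$$

with initial conditions $\mathbf{R}_1(0) = z\mathbf{L}(0)$ and $\mathbf{R}_2(0) = \mathbf{I}$. From their solutions the
error-covariance matrix is

$$\mathbf{P}_z(t) = \mathbf{R}_1(t)[\mathbf{R}_2(t)]^{-1},$$

as one can show by differentiating this and using (11-182) and (11-183),

$$\frac{d\mathbf{P}_z}{dt} = \frac{d\mathbf{R}_1}{dt}\mathbf{R}_2^{-1} - \mathbf{R}_1\mathbf{R}_2^{-1}\frac{d\mathbf{R}_2}{dt}\mathbf{R}_2^{-1}
= (\mathbf{F}\mathbf{P}_z + zQ\mathbf{G}\mathbf{G}^+) - \mathbf{P}_z(N^{-1}\mathbf{C}^+\mathbf{C}\mathbf{P}_z - \mathbf{F}^+),$$

which is (11-179). Furthermore,

$$\mathbf{P}_z(0) = \mathbf{R}_1(0) = z\mathbf{L}(0).$$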
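Now (11-181) becomes

$$\ln D(z) = \frac{1}{N} \int_0^T \mathbf{C}\mathbf{R}_1\mathbf{R}_2^{-1}\mathbf{C}^+\, dt = \frac{1}{N} \int_0^T \mathrm{Tr}(\mathbf{C}^+\mathbf{C}\mathbf{R}_1\mathbf{R}_2^{-1})\, dt$$

by the rule for traces, Tr AB = Tr BA; as $\mathbf{C}\mathbf{R}_1\mathbf{R}_2^{-1}\mathbf{C}^+$ is a 1 × 1 matrix, it is equal to
its trace. Substituting for $N^{-1}\mathbf{C}^+\mathbf{C}\mathbf{R}_1$ from (11-183), we find

$$\ln D(z) = \int_0^T \mathrm{Tr}\left[\left(\frac{d\mathbf{R}_2}{dt} + \mathbf{F}^+\mathbf{R}_2\right)\mathbf{R}_2^{-1}\right] dt$$
$$= \mathrm{Tr} \ln \mathbf{R}_2(T) + \int_0^T \mathrm{Tr}\, \mathbf{F}^+(t)\, dt$$
$$= \ln \det \mathbf{R}_2(T) + \int_0^T \mathrm{Tr}\, \mathbf{F}^+(t)\, dt,$$

and the Fredholm determinant is

$$D(z) = \det \mathbf{R}_2(T) \exp\left[\int_0^T \mathrm{Tr}\, \mathbf{F}^+(t)\, dt\right]. \tag{11-184}$$

Thus D(z) can be computed directly from the solution of the set (11-182) and
(11-183) of linear differential equations without any subsequent integration over
(0, T). This result is due to Collins [Col68]. For a time-variable linear system this
solution would usually have to be carried out numerically.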

As a simple example, suppose that the signal has the Lorentz spectral density
(11-47). Then in the state equations (11-155) we can take $F = -\mu$, $G = 1$, $Q = 2\mu P_s$,
and $C = 1$, all these now being scalars. The signal is $S(t) = x_1(t)$, which is
the single state variable. The steady-state variance equation (11-164) is now

$$2LF = -QG^2 = -2\mu P_s,$$

whereupon $L = P_s$. Now (11-182) and (11-183) become a pair of first-order linear
differential equations:

$$\frac{dR_1}{dt} = -\mu R_1 + zQR_2, \tag{11-185}$$

$$\frac{dR_2}{dt} = N^{-1}R_1 + \mu R_2, \tag{11-186}$$

with the initial conditions $R_1(0) = zP_s$, $R_2(0) = 1$. Their solution will have the form

$$R_1(t) = A\exp\beta_1 t + B\exp\beta_2 t, \qquad R_2(t) = C\exp\beta_1 t + D\exp\beta_2 t. \tag{11-187}$$

When we substitute these into (11-185) and (11-186) and separately equate the
coefficients of $\exp\beta_1 t$ and $\exp\beta_2 t$ on both sides, we find

$$(\beta_1 + \mu)A = zQC, \qquad (\beta_2 + \mu)B = zQD,$$
$$(\beta_1 - \mu)C = N^{-1}A, \qquad (\beta_2 - \mu)D = N^{-1}B,$$

and these are consistent only if

$$\beta_i^2 - \mu^2 = \frac{zQ}{N} = \frac{2\mu P_s z}{N}, \qquad i = 1, 2.$$

We can therefore set $\beta_1 = \beta$ and $\beta_2 = -\beta$, with $\beta$ given by (11-327).
The initial conditions yield the equations

$$A + B = R_1(0) = P_s z, \qquad C + D = R_2(0) = 1.$$

Solving these for the coefficients A, B, C, and D, we obtain

$$A = \frac{N(\beta-\mu)(\beta+\mu)^2}{4\mu\beta}, \quad B = \frac{N(\beta+\mu)(\beta-\mu)^2}{4\mu\beta}, \quad
C = \frac{(\beta+\mu)^2}{4\mu\beta}, \quad D = -\frac{(\beta-\mu)^2}{4\mu\beta}.$$

Substituting these into (11-187) and dividing, we find for the error covariance
function

$$P_z(t) = \frac{R_1(t)}{R_2(t)} = 2\mu P_s z\,\frac{(\beta+\mu)e^{\beta t} + (\beta-\mu)e^{-\beta t}}{(\beta+\mu)^2 e^{\beta t} - (\beta-\mu)^2 e^{-\beta t}}.$$

By (11-180) and (11-129) this is $Nm(t,t;z;t) = N^2 z\,h(t,t;z)$ in agreement with
(11-127). Substituting our solution for $R_2(t)$ into (11-184), on the other hand, we
obtain the Fredholm determinant (11-128) directly.
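The Lorentz example can be checked numerically. The sketch below, which is not from the text, integrates the scalar pair (11-185) and (11-186) with a fixed-step Runge-Kutta scheme and applies (11-184) with $F = -\mu$; it then compares the result with the closed form $e^{-\mu T}[(\beta+\mu)^2 e^{\beta T} - (\beta-\mu)^2 e^{-\beta T}]/(4\mu\beta)$ that follows from the coefficients C and D above. The function names, the step count, and the parameter values in the usage note are illustrative assumptions.

```python
import numpy as np

def fredholm_lorentz_ode(z, mu, Ps, N, T, steps=4000):
    """D(z) from (11-184) by RK4 integration of (11-185)-(11-186); z may be complex."""
    Q = 2.0 * mu * Ps

    def rhs(R):
        R1, R2 = R
        return np.array([-mu * R1 + z * Q * R2,   # (11-185)
                         R1 / N + mu * R2],       # (11-186)
                        dtype=complex)

    R = np.array([z * Ps, 1.0], dtype=complex)    # R1(0) = z*Ps, R2(0) = 1
    h = T / steps
    for _ in range(steps):
        k1 = rhs(R)
        k2 = rhs(R + 0.5 * h * k1)
        k3 = rhs(R + 0.5 * h * k2)
        k4 = rhs(R + h * k3)
        R = R + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    # Scalar case with F = -mu: det R2(T) = R2(T), exp(int Tr F^+ dt) = exp(-mu*T)
    return R[1] * np.exp(-mu * T)

def fredholm_lorentz_closed(z, mu, Ps, N, T):
    """Closed form obtained from R2(T) = C exp(beta T) + D exp(-beta T)."""
    beta = np.sqrt(mu**2 + 2.0 * mu * Ps * z / N + 0j)
    return np.exp(-mu * T) * ((beta + mu)**2 * np.exp(beta * T)
                              - (beta - mu)**2 * np.exp(-beta * T)) / (4 * mu * beta)
```

For instance, `fredholm_lorentz_ode(0.7 + 0.4j, mu=1.5, Ps=2.0, N=1.0, T=1.0)` should agree closely with `fredholm_lorentz_closed` at the same arguments; the closed form is even in $\beta$, so the branch of the square root does not matter.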

A particular example with n = 2 has been worked out in detail by Kerr [Ker89]. 

11.4.3 The Schweppe Likelihood-ratio Receiver

The optimum statistic U for detecting the Gaussian stochastic signal Re $S(t)\exp i\Omega t$
in white noise has been shown to be realizable as in (11-146). When the complex
envelope $S(t)$ can be represented as the output $S(t) = \mathbf{C}\mathbf{x}(t)$ of a linear system
described as in (11-155) by a set of state equations, the causal estimator needed in
(11-146) is

$$\hat{S}_1(t) = \mathbf{C}\hat{\mathbf{x}}(t),$$

where by (11-176) the components of the estimator $\hat{\mathbf{x}}(t)$ of the state vector satisfy
the set of differential equations

$$\frac{d\hat{\mathbf{x}}}{dt} = \mathbf{F}\hat{\mathbf{x}}(t) + \mathbf{k}_1(t,t)[V(t) - \hat{S}_1(t)]. \tag{11-188}$$

The Kalman gain vector $\mathbf{k}_1(t,t)$ is determined in advance from the solution of the
Riccati equation (11-179) with z = 1:

$$\mathbf{k}_1(t,t) = N^{-1}\mathbf{P}_1(t)\mathbf{C}^+(t).$$

The operation of the receiver is shown schematically in Fig. 11-7. The input
$V(t)$ is turned on at time t = 0, at which time all state variables are zero. From the
block marked $I_1$ the components of the estimate $\hat{\mathbf{x}}(t)$ emerge continually thereafter.
(The double lines in the diagram indicate conduction of n vector components.) From
that estimate the block "C" generates the causal estimator of the signal by multiplying
the vector $\hat{\mathbf{x}}(t)$ by the row vector C. It is subtracted from the input $V(t)$ at point
$D_1$, and at the multiplier $M_1$ the difference $V(t) - \hat{S}_1(t)$ multiplies each component
of the Kalman gain vector $\mathbf{k}_1(t,t)$. To each of the components of the product the
components of the output $\mathbf{F}\hat{\mathbf{x}}(t)$ of the box "F" are added at point $A_1$. That box
carries out the multiplication of the vector $\hat{\mathbf{x}}(t)$ by the n x n matrix $\mathbf{F}(t)$.

Figure 11-7. Schweppe likelihood-ratio receiver.

These operations could be carried out at the carrier frequency $\Omega$ of the input
or, more conveniently, at some lower intermediate frequency down to which the
input Re $V(t)\exp i\Omega t$ has been beaten. The integrator $I_1$ can then be a narrowband
simply resonant circuit, resonant at the carrier or intermediate frequency and having
a bandwidth much smaller than the bandwidth occupied by the input. The elements
of the Kalman gain $\mathbf{k}_1(t,t)$ are low-pass functions of time.

At the multiplier $M_2$ the components of double frequency $2\Omega$ are filtered off,
so that its output is the low-pass signal

$$[\mathrm{Re}\,V(t)e^{i\Omega t}][\mathrm{Re}\,\hat{S}_1(t)e^{i\Omega t}] \to \mathrm{Re}\,\hat{S}_1^*(t)V(t).$$

From this the output $\tfrac{1}{2}|\hat{S}_1(t)|^2$ of the rectifier is subtracted at $D_2$, and their difference
is integrated at $I_2$. At time t = T the output of this low-pass integrator equals the
optimum statistic U. The box marked "Alarm" compares this statistic with the
decision level $U_0$ and reports the presence of a signal if $U > U_0$. As the formulation
of the optimum detector in (11-146) and its generation by way of the estimator $\hat{\mathbf{x}}(t)$
of a state vector were proposed by Schweppe [Sch65], this system has been termed
the Schweppe likelihood-ratio receiver.

11.5 DETECTION OF A COHERENT SIGNAL AND A
STOCHASTIC SIGNAL

11.5.1 The Optimum Detector



When a narrowband signal can pass to the receiver both directly and via a scattering
medium such as the ionosphere, the receiver is confronted with the choice between
two hypotheses

$$V(t) = N(t), \qquad (H_0)$$
$$V(t) = N(t) + S(t) + R(t), \qquad (H_1)$$

about the complex envelope of its input $v(t) = \mathrm{Re}\,V(t)\exp i\Omega t$ during an observation
interval 0 < t < T. As before in this chapter, $S(t)$ is a realization of a circular
complex Gaussian random process with complex autocovariance function $\phi_s(t,u)$.
The signal Re $R(t)\exp i\Omega t$ is deterministic; it is called the specular component of
the input. We assume that its complex envelope is completely known even unto
its r-f phase; the receiver is synchronized with the transmitter. As in most of our
work here, we again assume that the noise Re $N(t)\exp i\Omega t$ is white with unilateral
spectral density N.

The optimum detector is most directly derived by expressing the input and its
constituents as Karhunen-Loeve expansions in terms of the eigenfunctions $f_k(t)$ of
the autocovariance function $\phi_s(t,u)$, defined as in (11-74):

$$V(t) = \sum_k V_k f_k(t), \qquad S(t) = \sum_k S_k f_k(t), \qquad R(t) = \sum_k R_k f_k(t).$$

Under hypothesis $H_0$ the joint probability density function of the real and imaginary
parts of the first n of the coefficients, $V^{(n)} = (V_1, V_2, \ldots, V_n)$, is

$$p_0(V^{(n)}) = (2\pi N)^{-n}\exp\left[-\frac{1}{2N}\sum_{k=1}^n |V_k|^2\right]; \tag{11-189}$$

compare (3-47). Under hypothesis $H_1$ the independent circular complex Gaussian
random variables $V_k$ have expected values $R_k$, and the variances of their real and
imaginary parts are $N(1 + \lambda_k)$, where the $\lambda_k$ are the eigenvalues of $N^{-1}\phi_s(t,u)$ as
in (11-74). The corresponding joint probability density function is thereupon

$$p_1(V^{(n)}) = \prod_{k=1}^n [2\pi N(1+\lambda_k)]^{-1}\exp\left[-\sum_{k=1}^n \frac{|V_k - R_k|^2}{2N(1+\lambda_k)}\right] \tag{11-190}$$

as in (3-49). Dividing and passing to the limit of an infinite number of coefficients,
we obtain the likelihood ratio

$$\Lambda(V) = \frac{1}{D(1)}\exp\left[\frac{1}{2N}\sum_k \frac{\lambda_k|V_k|^2 + 2\,\mathrm{Re}\,R_k^*V_k}{1+\lambda_k} - \frac{1}{2N}\sum_k \frac{|R_k|^2}{1+\lambda_k}\right], \tag{11-191}$$

where $D(\cdot)$ is the Fredholm determinant (11-76).

A sufficient statistic for deciding between the hypotheses is therefore U + W,
where

$$U = \frac{1}{2N}\sum_k \frac{\lambda_k|V_k|^2}{1+\lambda_k}$$

is given by (11-35) with the kernel $h(t,s) = h(t,s;1)$ the solution of (11-53) as
before, and

$$W = \frac{1}{N}\,\mathrm{Re}\sum_k Q_k^*V_k, \qquad Q_k = \frac{R_k}{1+\lambda_k}. \tag{11-192}$$

As in Sec. 3.3.1, the latter can be written

$$W = \frac{1}{N}\,\mathrm{Re}\int_0^T Q^*(t)V(t)\,dt,$$

where $Q(t)$ is the solution of the integral equation

$$R(t) = Q(t) + \frac{1}{N}\int_0^T \phi_s(t,u)Q(u)\,du, \qquad 0 < t < T. \tag{11-193}$$

The term U can be generated by any of the methods described in Sec. 11.2.4; the
term W is the output at time t = T of a narrowband filter matched to the signal
$N^{-1}\,\mathrm{Re}\,Q(t)\exp i\Omega t$. The optimum detector is thus a combination of that for the
stochastic portion Re $S(t)\exp i\Omega t$ of the input and the optimum detector for the
signal Re $R(t)\exp i\Omega t$ as though this were arriving in the presence of both the white
noise and the stochastic signal.

11.5.2 The Performance of the Optimum Detector

Calculating the false-alarm and detection probabilities $Q_0$ and $Q_d$ of the optimum
receiver again requires the moment-generating functions $h_0(z)$ and $h_1(z)$ of the
optimum statistic under the two hypotheses. The probability of error in the
communication system is then, as before,

$$P_e = \zeta_0 Q_0 + \zeta_1(1 - Q_d),$$

where $\zeta_0$ and $\zeta_1$ are the relative frequencies of 0's and 1's. For simplicity we suppose
the decisions to be based on the logarithm $\ln\Lambda[V(t)]$ of the likelihood functional,
which equals U + W plus a constant depending on the signal $R(t)$. The statistic

$$U' = \frac{1}{2N}\sum_k \frac{\lambda_k|V_k|^2 + 2\,\mathrm{Re}\,R_k^*V_k - |R_k|^2}{1+\lambda_k} \tag{11-194}$$

is compared with a decision level $U_0$; if $U' > U_0$, hypothesis $H_1$ is chosen. Under
hypothesis $H_0$ its moment-generating function is

$$h_0(z) = E(e^{-zU'}\,|\,H_0),$$

to be evaluated by (11-189). After combining the terms in the exponents and using (B-10) or, what is
the same thing, the normalization integral for the circular complex Gaussian density
function, we find

$$h_0(z) = \prod_k \frac{1+\lambda_k}{1+(1+z)\lambda_k}\exp\left[\frac{z(1+z)|R_k|^2}{2N[1+(1+z)\lambda_k]}\right]
= \frac{D(1)}{D(1+z)}\exp[z(1+z)J(1+z)], \tag{11-195}$$

where $D(\cdot)$ is again the Fredholm determinant and

$$J(z) = \frac{1}{2N}\sum_k \frac{|R_k|^2}{1+z\lambda_k}. \tag{11-196}$$

By using the result of Problem 1-12, we find that the moment-generating function
of the statistic U' under hypothesis $H_1$ is

$$h_1(z) = E(e^{-zU'}\,|\,H_1) = \frac{1}{D(z)}\exp[-z(1-z)J(z)]. \tag{11-197}$$

These reduce to the moment-generating functions of U as in (11-73) when $R(t) \equiv 0$.

There are several ways of calculating the term $J(z)$. As in (11-192), we can
write it as

$$J(z) = \frac{1}{2N}\int_0^T R^*(t)Q(t;z)\,dt, \tag{11-198}$$

where $Q(t;z)$ is the solution of the integral equation

$$R(t) = Q(t;z) + \frac{z}{N}\int_0^T \phi_s(t,u)Q(u;z)\,du, \qquad 0 < t < T. \tag{11-199}$$

When the stochastic signal is stationary, $\phi_s(t,u) = \phi_s(t-u)$, and possesses a
rational spectral density, the method of Appendix A can be utilized.
If the coherent signal is a sum of sine waves,

$$R(t) = \sum_m r_m\exp i w_m t, \tag{11-200}$$

and the stochastic signal has a rational spectral density, then as shown in Appendix
J, the function $J(z)$ can be expressed as $J(z) = J_1 + J_2$, where
T ,-,*r, 




e{w m T - w n T), 



//<„■;-) - 1 + ~^ s {w) t 



1 



det 



(11-201) 



(11-202) 



2N det G 

with G the 2n x 2n matrix defined in (11-143). Here E is a 2n-element row vector
whose elements are

"exp(3; - m m )T- I 



e j = 5>,: 



1 <./ < 2n, 
and F is a 2n-element column vector whose elements are

f m Skin 



Fk =y 5*21_ i < k < 2n , 



_ f exp iw m T, 1 < k < «, 



{ea 
1, 

As in Sec. 11.3.4, the $\beta_j$ are the roots of

$$1 + \frac{z}{N}\Phi_s(-ip) = 0, \qquad p = \beta_1, \beta_2, \ldots, \beta_{2n}, \tag{11-204}$$

which can be transformed into an algebraic equation of degree 2n, and the $\mu_k$ are
the 2n poles of $\Phi_s(-ip)$, arranged so that the first n lie in the right half of the p-plane
and the rest lie in the left half; $\mu_{k+n} = -\mu_k^*$. The coefficients $r_m$ might, for instance,
be calculated by setting the angular frequencies $w_m$ equal to integral multiples of
$2\pi/T$ and taking the Fourier transform of the signal envelope $R(t)$ by means of a
fast Fourier transform algorithm. In that case

$$\epsilon(w_mT - w_nT) = \delta_{mn}$$

in (11-201).

Another method of calculating the term $J(z)$ in (11-196) arises from the
estimator-correlator approach in Sec. 11.3.5. We write it as

$$J(z) = \frac{1}{2N}\sum_k |R_k|^2\left[1 - \frac{z\lambda_k}{1+z\lambda_k}\right]
= \frac{1}{2N}\int_0^T |R(\tau)|^2\,d\tau - \frac{z}{2}\int_0^T\!\!\int_0^T R^*(r)h(r,t;z;T)R(t)\,dr\,dt, \tag{11-205}$$

where

$$h(r,t;z;T) = \frac{1}{N}\sum_k \frac{\lambda_k}{1+z\lambda_k}f_k(r)f_k^*(t) \tag{11-206}$$

is the temporal function corresponding as in (11-16) to the matrix
in (11-124).

We want to carry through the same type of analysis as led to (11-146), but for
the "signal" $S_z(t)$ whose complex autocovariance function is $z\phi_s(t,u)$. We need to
adapt (11-121) for complex z. If the $\lambda_k(\tau)$ are the eigenvalues of the integral equation

$$\lambda_k(\tau)f_k(t;\tau) = \frac{1}{N}\int_0^\tau \phi_s(t,u)f_k(u;\tau)\,du, \qquad 0 < t < \tau,$$

then as in (11-206)

$$h(r,t;z;\tau) = \frac{1}{N}\sum_k \frac{\lambda_k(\tau)}{1+z\lambda_k(\tau)}f_k(r;\tau)f_k^*(t;\tau),$$

and interchanging r and t we find

$$h(t,r;z;\tau) = [h(r,t;z^*;\tau)]^* = h^*(r,t;z^*;\tau).$$

Thus Siegert's relation (11-121) for the derivative of $h(r,t;z;\tau)$ with respect to $\tau$
can be written

$$\frac{\partial}{\partial\tau}h(r,t;z;\tau) = -Nz\,h^*(\tau,r;z^*;\tau)\,h(\tau,t;z;\tau). \tag{11-207}$$

Defining

$$U_R(\tau) = \frac{1}{2}\int_0^\tau\!\!\int_0^\tau R^*(r)h(r,t;z;\tau)R(t)\,dr\,dt$$

and following the procedure that led to (11-146), we find

$$\frac{dU_R}{d\tau} = \frac{1}{2}R^*(\tau)\int_0^\tau h(\tau,t;z;\tau)R(t)\,dt + \text{c.c.}
+ \frac{1}{2}\int_0^\tau\!\!\int_0^\tau R^*(r)\,\frac{\partial h(r,t;z;\tau)}{\partial\tau}\,R(t)\,dr\,dt. \tag{11-208}$$

Remembering (11-117) and using (11-129), we define the causal quasi-estimate

$$\hat{R}(\tau;z) = \int_0^\tau m(\tau,u;z;\tau)R(u)\,du = Nz\int_0^\tau h(\tau,u;z;\tau)R(u)\,du, \tag{11-209}$$

whereupon, with (11-207), we can write

$$\frac{dU_R}{d\tau} = \frac{1}{2Nz}\left[R^*(\tau)\hat{R}(\tau;z) + \hat{R}^*(\tau;z^*)R(\tau) - \hat{R}^*(\tau;z^*)\hat{R}(\tau;z)\right],$$

the last term arising from

$$-\frac{Nz}{2}\int_0^\tau\!\!\int_0^\tau h^*(\tau,r;z^*;\tau)R^*(r)\,h(\tau,t;z;\tau)R(t)\,dr\,dt,$$

and integrating over $0 < \tau < T$, we obtain for (11-205)

$$J(z) = \frac{1}{2N}\int_0^T [R^*(\tau) - \hat{R}^*(\tau;z^*)][R(\tau) - \hat{R}(\tau;z)]\,d\tau. \tag{11-210}$$

Whatever means we have for determining the causal estimator $\hat{S}_1(t)$ of the complex
envelope $S(t)$ of the signal can be utilized to compute the causal quasi-estimate
$\hat{R}(\tau;z)$ needed in evaluating the function $J(z)$ by this expression.

When in particular the stochastic signal $S(t)$ can be generated as the output
of a linear system driven by white noise, as in Sec. 11.4, the function $\hat{R}(\tau;z)$ can
be determined in the same way as we there showed one can determine the estimator
$\hat{S}_z(t)$. By (11-209) and (11-173),

$$\hat{R}(\tau;z) = \mathbf{C}\hat{\mathbf{r}}_z(\tau)$$

with $\hat{\mathbf{r}}_z(\tau)$ a column vector of n components and C the row vector in (11-157). As
in (11-176), the former obeys the differential equation

$$\frac{d\hat{\mathbf{r}}_z}{d\tau} = \mathbf{F}\hat{\mathbf{r}}_z(\tau) + \mathbf{k}_z(\tau,\tau)[R(\tau) - \hat{R}(\tau;z)],$$

with the initial condition $\hat{\mathbf{r}}_z(0) = 0$. The n x n matrix F specifies the dynamics of the
linear system as in (11-155), and $\mathbf{k}_z(\tau,\tau)$ is again the Kalman gain in (11-177). The
causal quasi-estimate $\hat{R}(\tau;z)$ needed in (11-210) can thus be computed by solving
this set of n differential equations, once the Kalman gain has been determined
as in Sec. 11.4 by solving the Riccati equation (11-179). This computation can
evaluate $J(z)$ and, through (11-181) or (11-184), the Fredholm determinant needed
to compute the moment-generating functions of the decision statistic U' at all points
of the contours of integration in (11-94) and (11-95) when evaluating the false-alarm
and detection probabilities for the receiver of a combination of a coherent and a
stochastic signal.

11.6 DISCRETE-TIME PROCESSING^1

^1 Chapter 12 does not draw upon the material in this section.

11.6.1 The Optimum Statistic

In order to process the input $v(t)$ digitally, it can be filtered to remove noise with
frequencies far from those of the signal to be detected, and the filtered input can
be sampled at discrete times uniformly spaced. The decision about the presence or
absence of a signal will then be based on a finite number n of samples $V_1, V_2, \ldots, V_n$
of the complex envelope $V(t)$ of the input. We gather these into a column vector V.

When as throughout this chapter the signal and the noise are circular complex
Gaussian random processes, the samples $V_k$ are circular complex Gaussian random
variables. Under hypotheses $H_0$ and $H_1$ they have zero expected values, their n x n
complex covariance matrices are $\boldsymbol{\phi}_0$ and $\boldsymbol{\phi}_1$, respectively, and

$$\boldsymbol{\phi}_1 = \boldsymbol{\phi}_0 + \boldsymbol{\phi}_s,$$

where $\boldsymbol{\phi}_s$ is the n x n complex covariance matrix of the samples $S_k$ of the complex
envelope $S(t)$ of the signal.

By the same procedure as in (11-30) through (11-33) we find that the optimum
strategy for deciding between hypotheses $H_0$ and $H_1$ is to compare the statistic

$$U = \tfrac{1}{2}\mathbf{V}^+\mathbf{H}\mathbf{V}$$

with a decision level $U_0$ and to choose hypothesis $H_1$ if $U > U_0$, and otherwise
$H_0$. The level $U_0$ is set to induce a preassigned false-alarm probability
$Q_0 = \Pr(U > U_0\,|\,H_0)$. Now H is an n x n matrix given by

$$\mathbf{H} = \boldsymbol{\phi}_0^{-1} - \boldsymbol{\phi}_1^{-1}$$

as in (11-32); it can be computed once for all and stored in the computer that is
carrying out this discrete-time processing of the input.
The matrix H can be decomposed as

$$\mathbf{H} = \mathbf{G}^+\mathbf{G}, \tag{11-211}$$

where G is a lower-triangular matrix $\|g_{ij}\|$ with $g_{ij} \equiv 0$, j > i. This decomposition
into the lower-triangular matrix G and the upper-triangular matrix $\mathbf{G}^+$ can be carried
out by standard algorithms [Pre86, pp. 31-8]. The detection statistic can then be
written as

$$U = \tfrac{1}{2}\mathbf{W}^+\mathbf{W} = \tfrac{1}{2}\sum_{i=1}^n |W_i|^2, \tag{11-212}$$

where the elements of the column vector W = GV are

$$W_i = \sum_{j=1}^i g_{ij}V_j. \tag{11-213}$$

Thus $W_i$ depends only on the present sample $V_i$ and the previous samples $V_1, V_2,
\ldots, V_{i-1}$, and (11-213) represents a causal discrete-time linear filtering of the input
samples $V_j$. By (11-212) the statistic U can be computed in real time, provided that
the computer works fast enough to evaluate (11-213) between samplings.
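The factorization (11-211) and the causal filtering (11-213) can be sketched as follows. This is an illustration, not the algorithm of [Pre86]: it assumes white noise of unit variance, so that $\mathbf{H} = \mathbf{I} - (\mathbf{I} + \boldsymbol{\phi}_s)^{-1}$, and it obtains the lower-triangular factor G with $\mathbf{H} = \mathbf{G}^+\mathbf{G}$ by applying NumPy's ordinary Cholesky routine to the row- and column-reversed matrix. The function names are hypothetical.

```python
import numpy as np

def detector_kernel(Phi_s):
    """H = I - (I + Phi_s)^{-1}, assuming the noise covariance is the identity."""
    n = Phi_s.shape[0]
    H = np.eye(n) - np.linalg.inv(np.eye(n) + Phi_s)
    return 0.5 * (H + H.conj().T)          # remove round-off asymmetry

def lower_factor(H):
    """Lower-triangular G with H = G^+ G, for Hermitian positive-definite H."""
    J = np.eye(H.shape[0])[::-1]           # exchange matrix, J @ J = I
    L = np.linalg.cholesky(J @ H @ J)      # J H J = L L^+, with L lower triangular
    return J @ L.conj().T @ J              # reversing L^+ makes it lower triangular

def statistic_direct(H, V):
    """U = (1/2) V^+ H V."""
    return 0.5 * np.real(V.conj() @ H @ V)

def statistic_causal(G, V):
    """U = (1/2) sum |W_i|^2, where W_i uses only V_1, ..., V_i as in (11-213)."""
    U = 0.0
    for i in range(len(V)):
        W_i = G[i, :i + 1] @ V[:i + 1]
        U += 0.5 * abs(W_i) ** 2
    return U
```

Because each $W_i$ involves only samples already received, `statistic_causal` could be evaluated sample by sample; both routes should give the same value of U for any data vector.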

We assume henceforth that under hypothesis $H_0$ the real and imaginary parts
of the samples $V_k = V_{kx} + iV_{ky}$ are statistically independent and possess variances
equal to some number N. For the sake of a less cumbersome notation, we divide
all the data, signal and noise alike, by $\sqrt{N}$, in effect setting N equal to 1. Then under
hypothesis $H_0$ the variances of the real and imaginary parts of the data are

$$\tfrac{1}{2}E(|V_k|^2\,|\,H_0) = \mathrm{Var}\,V_{kx} = \mathrm{Var}\,V_{ky} \equiv 1, \tag{11-214}$$

and their covariance matrix $\boldsymbol{\phi}_0$ equals the n x n identity matrix I. Furthermore
$\boldsymbol{\phi}_1 = \mathbf{I} + \boldsymbol{\phi}_s$, and the kernel of the optimum statistic U is

$$\mathbf{H} = \mathbf{I} - (\mathbf{I} + \boldsymbol{\phi}_s)^{-1} = \boldsymbol{\phi}_s(\mathbf{I} + \boldsymbol{\phi}_s)^{-1}. \tag{11-215}$$

When n >> 1, computing the matrices H and G and storing both them and the
data V may exceed the capacities of one's computer, and more efficient methods of
generating the optimum statistic U must be sought. To that end we define

$$\mathbf{V}^{(k)} = (V_1, V_2, \ldots, V_k)$$

for our temporally ordered samples; $\mathbf{V} = \mathbf{V}^{(n)}$. Under hypothesis $H_1$ the joint
probability density function of the data V can be expressed as

$$p_1(\mathbf{V}) = p_1(V_n\,|\,\mathbf{V}^{(n-1)})\,p_1(\mathbf{V}^{(n-1)}),$$

and continuing thus we write it as

$$p_1(\mathbf{V}) = p_1(V_n\,|\,\mathbf{V}^{(n-1)})\,p_1(V_{n-1}\,|\,\mathbf{V}^{(n-2)})\cdots p_1(V_k\,|\,\mathbf{V}^{(k-1)})\cdots p_1(V_2\,|\,V_1)\,p_1(V_1). \tag{11-216}$$

Because the data are circular complex Gaussian random variables,

$$p_1(V_k\,|\,\mathbf{V}^{(k-1)}) = \frac{1}{2\pi W_k}\exp\left[-\frac{1}{2W_k}\left|V_k - E_1(V_k\,|\,\mathbf{V}^{(k-1)})\right|^2\right], \tag{11-217}$$

where

$$W_k = \tfrac{1}{2}E_1\left[\left|V_k - E_1(V_k\,|\,\mathbf{V}^{(k-1)})\right|^2\right] \tag{11-218}$$

is the conditional variance of the real and imaginary parts of the datum $V_k$, given the
past input samples $V_1, V_2, \ldots, V_{k-1}$. Here $E_1$ denotes an expected value under
hypothesis $H_1$. Because $V_k = S_k + N_k$ and the samples $N_k$ of the noise are independent
of those of the signal and of each other,

$$E_1(V_k\,|\,\mathbf{V}^{(k-1)}) = E_1(S_k\,|\,\mathbf{V}^{(k-1)}) = \hat{S}_k,$$

and $\hat{S}_k$ is the minimum-mean-square-error estimator of the signal sample $S_k$ based
on the data collected before "time" k. Furthermore,

$$W_k = \tfrac{1}{2}E_1(|V_k - \hat{S}_k|^2) = 1 + \tfrac{1}{2}E_1(|S_k - \hat{S}_k|^2).$$

Remembering that the joint probability density function of the data V under
hypothesis $H_1$ is

$$p_1(\mathbf{V}) = (2\pi)^{-n}(\det\boldsymbol{\phi}_1)^{-1}\exp(-\tfrac{1}{2}\mathbf{V}^+\boldsymbol{\phi}_1^{-1}\mathbf{V}),$$

we see from (11-216) and (11-217) that

$$\det\boldsymbol{\phi}_1 = \det(\mathbf{I} + \boldsymbol{\phi}_s) = \prod_{k=1}^n W_k. \tag{11-219}$$

Now using (11-217) to form the likelihood ratio for our decision between
hypotheses $H_0$ and $H_1$, we find that the optimum statistic U can be written as

$$U = \frac{1}{2}\sum_{k=1}^n |V_k|^2 - \frac{1}{2}\sum_{k=1}^n \frac{|V_k - \hat{S}_k|^2}{W_k}. \tag{11-220}$$

Because $\hat{S}_k$ depends only on the past data $\mathbf{V}^{(k-1)}$, this form too can be computed in
real time. Our task then is to determine that estimator and the quantities $W_k$ in an
efficient manner.

11.6.2 The Kalman Method

The amount of data storage and the duration of computations can be limited when
the sequence of signal samples $S_k$ can be modeled as the output of a linear system
driven by white noise, the discrete-time counterpart of the dynamical systems
considered in Sec. 11.4. The estimates $\hat{S}_k$ can then be calculated by the Kalman
equations for the one-step linear predictor of the signal samples on the basis of the
data $\{V_k\}$ [Kal60], [Sch65], [Lar79, vol. 2, pp. 120-9], [Hel91, pp. 523-39], [Pap91,
pp. 515-24]. The system is described by p circular complex Gaussian state variables,
whose values at time k we denote by $y_1[k], y_2[k], \ldots, y_p[k]$ and collect into a column
vector Y[k]. The system evolves in accordance with the dynamical equations

$$\mathbf{Y}[k+1] = \mathbf{A}(k)\mathbf{Y}[k] + \mathbf{G}(k+1)\,e[k+1],$$

where A(k) is a p x p matrix, G(k + 1) is a p-element column vector, and e[1],
e[2], ... constitute a sequence of statistically independent "white noise" variates
whose real and imaginary parts have zero expected values and variances

$$Q(k) = \tfrac{1}{2}E(|e[k]|^2).$$

The signal sample at time k is a linear combination of the state variables,

$$S_k = \mathbf{D}(k)\mathbf{Y}[k],$$

where D(k) is a p-element row vector.

When the sequence of signal samples $\{S_k\}$ is a stationary random process, its
covariance matrix $\boldsymbol{\phi}_s$ has the Toeplitz form

$$\phi_{s,ij} = \phi_{i-j}, \tag{11-221}$$

and one can define for it the discrete-time spectral density

$$F(w) = \sum_{k=-\infty}^{\infty}\phi_k w^{-k}, \qquad w = e^{i\theta}, \quad -\pi < \theta < \pi,$$

[Hel91, pp. 290-6], [Pap91, pp. 332-6]. If in particular the process $\{S_k\}$ is an
autoregressive moving-average (ARMA) process, this spectral density is a rational function
of w:

$$F(w) = \frac{A(w)}{B(w)}, \tag{11-222}$$

where A(w) and B(w) are polynomials in w. The degree of B(w) is 2p; that of
A(w) is less than or equal to 2p. The parameters of the linear system can then be
determined much as in Sec. 11.4.1. One factors F(w) into a part having its poles
and zeros outside the unit circle |w| = 1 in the w-plane and a part having its poles
and zeros inside the unit circle; see for instance [Hel91, pp. 325-33]. The method
now to be described is more general, however, and it allows the system parameters
to be functions of the time k and the sequence $\{S_k\}$ to be nonstationary.
At time k the complex covariance matrix of the state vector Y[k] is

$$\boldsymbol{\Sigma}(k) = \tfrac{1}{2}E(\mathbf{Y}[k]\mathbf{Y}^+[k]),$$

and the specification of the model must include its initial value $\boldsymbol{\Sigma}(1)$. We shall now
summarize the equations that define the Kalman method for one-step prediction of
a scalar random process; this is not the most general formulation, but one adequate for
our purposes. Their derivation and interpretation can be found in the references just
cited, and in numerous texts on signal processing and Kalman filtering.
The estimator of the kth signal sample is given by

$$\hat{S}_k = \mathbf{D}(k)\hat{\mathbf{Y}}[k\,|\,k-1], \tag{11-223}$$

where

$$\hat{\mathbf{Y}}[k\,|\,k-1] = E_1(\mathbf{Y}[k]\,|\,\mathbf{V}^{(k-1)})$$

is the one-step predictor of the state vector Y[k], given the previous data $V_1, V_2, \ldots,
V_{k-1}$. The Kalman prediction theory determines it by

$$\hat{\mathbf{Y}}[k\,|\,k-1] = \mathbf{A}(k-1)\hat{\mathbf{Y}}[k-1],$$

where $\hat{\mathbf{Y}}[k]$ is the estimator of the state vector at time k, k = 1, 2, .... It evolves
continually through the equation

$$\hat{\mathbf{Y}}[k] = \hat{\mathbf{Y}}[k\,|\,k-1] + \mathbf{K}(k)(V_k - \hat{S}_k),$$

where K(k) is a p-element column vector known as the Kalman gain. This is
determined in advance by the equations

$$\mathbf{K}(k) = W_k^{-1}\mathbf{P}(k\,|\,k-1)\mathbf{D}^+(k),$$

where

$$W_k = \mathbf{D}(k)\mathbf{P}(k\,|\,k-1)\mathbf{D}^+(k) + 1 \tag{11-224}$$

is the variance of the innovation $V_k - \hat{S}_k$, and $\mathbf{P}(k\,|\,k-1)$ is the p x p covariance
matrix of the error in the predictor $\hat{\mathbf{Y}}[k\,|\,k-1]$.

The evolution of the error covariance matrix is described by the equations

$$\mathbf{P}(k\,|\,k-1) = \mathbf{A}(k-1)\mathbf{P}(k-1)\mathbf{A}^+(k-1) + Q(k)\mathbf{G}(k)\mathbf{G}^+(k),$$
$$\mathbf{P}(k) = [\mathbf{I} - \mathbf{K}(k)\mathbf{D}(k)]\mathbf{P}(k\,|\,k-1). \tag{11-225}$$

The initial conditions for this process are

$$\hat{S}_1 = 0, \qquad \hat{\mathbf{Y}}[1\,|\,0] = 0, \qquad \hat{\mathbf{Y}}[1] = \mathbf{K}(1)V_1,$$
$$\mathbf{P}(1\,|\,0) = \boldsymbol{\Sigma}(1),$$
$$\mathbf{K}(1) = W_1^{-1}\boldsymbol{\Sigma}(1)\mathbf{D}^+(1),$$
$$W_1 = \mathbf{D}(1)\boldsymbol{\Sigma}(1)\mathbf{D}^+(1) + 1,$$
$$\mathbf{P}(1) = [\mathbf{I} - \mathbf{K}(1)\mathbf{D}(1)]\boldsymbol{\Sigma}(1).$$

At each time k the receiver puts the estimate $\hat{S}_k$ given by (11-223) and the variance
$W_k$ given by (11-224) into (11-220) and accumulates the sums in that equation,
obtaining the optimum statistic U at the end of the procedure.
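The recursion summarized above can be sketched in a few lines. The following is a minimal illustration under simplifying assumptions that are not in the text: the system parameters A, G, D, and Q are taken to be constant in time, and the helper `signal_covariance` exists only to build the matrix $\boldsymbol{\phi}_s$ implied by the model so that the result can be compared with the direct form $U = \tfrac{1}{2}\mathbf{V}^+\mathbf{H}\mathbf{V}$. All function names are hypothetical.

```python
import numpy as np

def kalman_statistic(V, A, G, D, Q, Sigma1):
    """Accumulate U of (11-220) with the one-step Kalman predictor.

    A: p x p matrix; G: length-p vector; D: length-p row vector;
    Q: noise variance; Sigma1: initial state covariance Sigma(1).
    Returns U and the innovation variances W_k.
    """
    p = A.shape[0]
    y_pred = np.zeros(p, dtype=complex)        # Y[1|0] = 0
    P_pred = Sigma1.astype(complex)            # P(1|0) = Sigma(1)
    U, W_all = 0.0, []
    for v in V:
        s_hat = D @ y_pred                             # (11-223)
        W = 1.0 + np.real(D @ P_pred @ D.conj())       # (11-224)
        K = P_pred @ D.conj() / W                      # Kalman gain
        U += 0.5 * abs(v) ** 2 - 0.5 * abs(v - s_hat) ** 2 / W
        y = y_pred + K * (v - s_hat)                   # state estimate at time k
        P = (np.eye(p) - np.outer(K, D)) @ P_pred      # (11-225), second line
        y_pred = A @ y                                 # predictor for time k + 1
        P_pred = A @ P @ A.conj().T + Q * np.outer(G, G.conj())
        W_all.append(W)
    return U, np.array(W_all)

def signal_covariance(n, A, G, D, Q, Sigma1):
    """Covariance matrix of S_1..S_n implied by the (time-invariant) model."""
    Sig = [Sigma1.astype(complex)]
    for _ in range(n - 1):
        Sig.append(A @ Sig[-1] @ A.conj().T + Q * np.outer(G, G.conj()))
    Phi = np.zeros((n, n), dtype=complex)
    for k in range(n):
        M = Sig[k]
        for j in range(k, n):
            Phi[j, k] = D @ M @ D.conj()       # D A^(j-k) Sigma(k) D^+
            Phi[k, j] = np.conj(Phi[j, k])
            M = A @ M
    return Phi
```

Only the p-element predictor and the p x p error covariance are carried from one sample to the next, which is the storage advantage the text describes. The product of the returned $W_k$ should equal $\det(\mathbf{I} + \boldsymbol{\phi}_s)$ as in (11-219).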

11.6.3 Stationary Data 

When the sequence $\{V_k\}$ is stationary, the elements of the Hermitian matrix $\boldsymbol{\phi}_1$ have
the form

$$\tfrac{1}{2}E_1(V_kV_m^*) = \phi_{1,km} = \delta_{km} + \phi_{k-m} = \delta_{km} + \phi^*_{m-k}. \tag{11-226}$$

The conditional expected values appearing in (11-217) are linear combinations of
the past data $\mathbf{V}^{(k-1)}$, and we can write them as

$$\hat{V}_k = E_1(V_k\,|\,\mathbf{V}^{(k-1)}) = \sum_{m=0}^{k-2} f_m^{(k)}V_{k-1-m}. \tag{11-227}$$

The k - 1 coefficients $f_m^{(k)}$ can be calculated recursively by what is known as the
Durbin algorithm [Dur60]. We need to modify it slightly in order to accommodate
the complex data and the Hermitian covariance matrix in (11-226); in so doing, we
follow the exposition in [Bla85, pp. 359-61].

By the principle of orthogonality, which we introduced in Sec. 6.1.7, the errors
$V_k - \hat{V}_k$ are orthogonal to the data $V_1, V_2, \ldots, V_{k-1}$; that is,

$$\tfrac{1}{2}E_1[(V_k - \hat{V}_k)V_r^*] \equiv 0, \qquad r = 1, 2, \ldots, k-1. \tag{11-228}$$

Putting (11-227) into this and using (11-226), we obtain the set of k - 1 linear
equations

$$\phi_{k-r} = f^{(k)}_{k-1-r} + \sum_{p=0}^{k-2}\phi_{k-1-p-r}f_p^{(k)}, \qquad r = 1, 2, \ldots, k-1,$$

or

$$f_t^{(k)} + \sum_{m=0}^{k-2}\phi_{t-m}f_m^{(k)} = \phi_{t+1}, \qquad 0 \le t \le k-2. \tag{11-229}$$

Initially

$$\hat{V}_1 = 0, \qquad \hat{V}_2 = f_0^{(2)}V_1, \qquad f_0^{(2)} = \frac{\phi_1}{1+\phi_0}.$$

Given the (k - 1)-element column vector $\{f_m^{(k)}\}$ that solves (11-229) at the kth
stage of this procedure, $k \ge 2$, one computes the solution for stage k + 1 as

$$f_r^{(k+1)} = f_r^{(k)} - \rho_k f^{(k)*}_{k-2-r}, \quad 0 \le r \le k-2, \qquad f^{(k+1)}_{k-1} = \rho_k, \tag{11-230}$$

where

$$\rho_k = \frac{\phi_k - \sum_{m=0}^{k-2}\phi_{k-1-m}f_m^{(k)}}{1 + \phi_0 - \Gamma_k}$$

with

$$\Gamma_k = \sum_{m=0}^{k-2}\phi_{m+1}f_m^{(k)*}. \tag{11-231}$$

The error variance $W_k$ in (11-217) is

$$W_k = \tfrac{1}{2}E_1[(V_k - \hat{V}_k)(V_k^* - \hat{V}_k^*)].$$

Because $\hat{V}_k$ is a linear combination of the data $V_r$ for r = 1, 2, ..., k - 1 through
(11-227), this reduces to

$$W_k = \tfrac{1}{2}E_1[|V_k|^2 - V_k\hat{V}_k^*]$$

by the conjugate of (11-228). Substituting the conjugate of (11-227) and using
(11-226), we find that this can be expressed as

$$W_k = 1 + \phi_0 - \sum_{m=0}^{k-2}\phi_{m+1}f_m^{(k)*} = 1 + \phi_0 - \Gamma_k \tag{11-232}$$

by (11-231).

The optimum statistic for detecting the stationary signal sequence in white
noise is thus

$$U = \frac{1}{2}\sum_{k=1}^n\left[|V_k|^2 - \frac{|V_k - \hat{V}_k|^2}{W_k}\right]$$

by (11-217). The $\hat{V}_k$ are computed from (11-227) and the $W_k$ from (11-232) at time
k once the solution vector $\{f_m^{(k)}\}$ has been updated from its previous components
by (11-230) and (11-231). In contrast to the Kalman method of Sec. 11.6.2, whose
storage requirements are independent of the number n of data, the present procedure
requires storing both the data vector $\mathbf{V}^{(k)}$ as its components arrive and the (k - 1)-element
vector that solves (11-229), and these vectors eventually grow to length n. The number
of computations at each stage is proportional to k and the total number proportional
to $n^2$. The number of computations needed to compute the matrix H in the original
form (11-215), on the other hand, will in general be on the order of $n^3$.
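A recursion of this kind can be sketched as below. The code uses the standard complex Levinson-Durbin form, with predictor coefficients `a[i-1]` multiplying the sample i steps in the past; its indexing and variable names are assumptions of this sketch and need not coincide with those of (11-229) through (11-232). The noise variance is taken as 1, so that the data covariances are $\phi_m + \delta_{m0}$.

```python
import numpy as np

def durbin_statistic(V, phi):
    """Detection statistic U and error variances W_k for stationary data.

    V:   complex samples V_1..V_n (V[0] is the first sample).
    phi: signal covariances phi_0..phi_{n-1}, phi[m] = (1/2) E[S_k S_{k-m}^*].
    """
    n = len(V)
    c = np.array(phi, dtype=complex)
    c[0] += 1.0                          # covariances of the data under H1
    a = np.zeros(0, dtype=complex)       # current predictor coefficients
    E = c[0].real                        # W_1 = 1 + phi_0
    U, W_all = 0.0, []
    for k in range(n):
        past = V[:k][::-1]               # most recent sample first
        v_hat = a @ past if k else 0.0   # one-step prediction of V[k]
        U += 0.5 * abs(V[k]) ** 2 - 0.5 * abs(V[k] - v_hat) ** 2 / E
        W_all.append(E)
        if k + 1 < n:
            m = k + 1                    # order of the next predictor
            kappa = (c[m] - a @ c[m - 1:0:-1]) / E      # reflection coefficient
            a = np.concatenate([a - kappa * np.conj(a[::-1]), [kappa]])
            E = E * (1.0 - abs(kappa) ** 2)             # updated error variance
    return U, np.array(W_all)
```

Each stage costs a number of operations proportional to k, so the whole pass is of order $n^2$, in line with the operation count quoted above. The product of the returned variances should equal $\det(\mathbf{I} + \boldsymbol{\phi}_s)$.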
11.6.4 Performance of the Detector

If, as in Sec. 11.2, we ask for the performance of such a discrete-time receiver when
the input signals have a total energy $\eta$ times that for which the receiver was designed,
we must seek the probability $Q_d(\eta)$ that the statistic U exceeds $U_0$ under a hypothesis
that the complex covariance matrix of the input samples is not $\boldsymbol{\phi}_1$, but

$$\boldsymbol{\phi}_\eta = \boldsymbol{\phi}_0 + \eta\boldsymbol{\phi}_s \qquad (H_\eta). \tag{11-233}$$

It can be computed as in (11-94) and (11-95) from the moment-generating function
$h_\eta(z)$ of the statistic U under hypothesis $H_\eta$, and as in (11-68) this is given by

$$h_\eta(z) = E(e^{-Uz}\,|\,H_\eta) = [\det(\mathbf{I} + z\boldsymbol{\phi}_\eta\mathbf{H})]^{-1}. \tag{11-234}$$

Unless the number n is excessive, this moment-generating function can be computed
numerically at points z on the contour of integration in (11-94) or (11-95). The
false-alarm probability is again $Q_0 = Q_d(0)$, and the probability of detecting the standard
signal for which the receiver was designed is $Q_d = Q_d(1)$.

When under hypothesis $H_0$ the input samples are uncorrelated as in (11-214),
the calculation of the probability $Q_d(\eta)$ can proceed as in Sec. 11.2.2, in the equations
of which the $\lambda_k$ are now the eigenvalues of the n x n Hermitian matrix $\boldsymbol{\phi}_s$; the residue
series will have only n terms. Under this circumstance, the moment-generating
function of the statistic U is again given by (11-96) in terms of the now finite
Fredholm determinant

$$D(z) = \det(\mathbf{I} + z\boldsymbol{\phi}_s) = \prod_{k=1}^n (1 + z\lambda_k). \tag{11-235}$$

When n >> 1, evaluating this n x n determinant can strain the capacities of
all but the largest digital computers. The computation can be expedited if one
possesses a simple model of the discrete-time random process $\{S_k\}$. We examine three
situations: (1) the signal samples form an ARMA process, (2) they are generated as
in Sec. 11.6.2 by a discrete-time linear system, and (3) they constitute as in Sec. 11.6.3
a stationary, but otherwise arbitrary random process.

11.6.4.1 ARMA process. When the spectral density F(w) in (11-222) is a
rational function of w, it possesses 2p poles $\mu_1, \mu_2, \ldots, \mu_{2p}$, of which $\mu_1, \mu_2, \ldots, \mu_p$
(the exterior poles) lie outside the unit circle (|w| = 1), and the rest, $\mu_{p+1}, \ldots,
\mu_{2p}$ (the interior poles), lie within it. Here we are confronted with an ARMA
process that is the discrete-time counterpart of the random process S(t) treated in
Sec. 11.3.4, and it can be shown that for n > 2p the Fredholm determinant can
be expressed in a form much like that in (11-142) [Hel89b]. We write the spectral
density F(w) as

$$F(w) = R + F'(w),$$

where $F'(w)$ (not a derivative!) goes to zero as w goes to 0 or to $\infty$. The term
R = F(0) represents a "white" component of the signal process and is ordinarily
zero.

Defining $\beta_1, \beta_2, \ldots, \beta_{2p}$ as the 2p roots of the equation

$$1 + zF(w) = 0, \tag{11-236}$$

we take $\beta_1, \beta_2, \ldots, \beta_p$ as those roots that go into the p exterior poles and $\beta_{p+1}, \ldots,
\beta_{2p}$ as those that go into the p interior poles as $z \to 0$. Then the Fredholm
determinant is given by

$$D(z) = (1 + Rz)^n\prod_{j=1}^p \mu_j^{-n}\,\frac{\det\mathbf{G}_n(z)}{\det\mathbf{G}_0(z)}, \tag{11-237}$$

where $\mathbf{G}_n(z)$ is a 2p x 2p matrix whose elements are

$$g_{jk} = \begin{cases} \dfrac{\beta_k^n}{\beta_k - \mu_j}, & 1 \le j \le p, \quad 1 \le k \le 2p, \\[2ex] \dfrac{1}{\beta_k - \mu_j}, & p+1 \le j \le 2p, \quad 1 \le k \le 2p. \end{cases} \tag{11-238}$$

The elements of $\mathbf{G}_0(z)$ are obtained by setting n = 0. When n >> 1, it is advisable
to factor $\beta_k^n$ out of the kth column of $\det\mathbf{G}_n(z)$ for $1 \le k \le p$ in order to avoid
overflow.

As an example, suppose that $\{S_k\}$ is a discrete-time Gaussian Markov process
so that its covariances have the form

$$\phi_k = Ca^{|k|}, \qquad |a| < 1. \tag{11-239}$$

Its spectral density is

$$F(w) = \frac{C(1-a^2)w}{(w-a)(1-aw)},$$

and R = 0, $\mu_1 = a^{-1}$, $\mu_2 = a$. Then (11-236) becomes

$$1 + \frac{Kw}{(w-a)(1-aw)} = 0, \qquad K = C(1-a^2)z,$$

which reduces to a quadratic equation

$$aw^2 - (1 + a^2 + K)w + a = 0 \tag{11-240}$$

and possesses two roots $\beta_1$ and $\beta_2 = \beta_1^{-1}$: as $z \to 0$, $\beta_1 \to a^{-1}$ and $\beta_2 \to a$.
The matrix in (11-238) is now

$$\mathbf{G}_n(z) = \begin{bmatrix} \dfrac{\beta_1^n}{\beta_1 - a^{-1}} & \dfrac{\beta_2^n}{\beta_2 - a^{-1}} \\[2ex] \dfrac{1}{\beta_1 - a} & \dfrac{1}{\beta_2 - a} \end{bmatrix}, \tag{11-241}$$

and after some algebra we find from (11-237) the Fredholm determinant

$$D(z) = a^n\,\frac{(1-a\beta_2)^2\beta_1^{n+1} - (1-a\beta_1)^2\beta_2^{n+1}}{(1-a^2)(\beta_1-\beta_2)}. \tag{11-242}$$

The roots $\beta_1$ and $\beta_2$ are functions of z through (11-240).

When z moves away from the origin along the negative Re z-axis, the roots $\beta_1$
and $\beta_2$ eventually meet the unit circle, and we can put

$$\beta_1 = e^{i\theta}, \qquad \beta_2 = e^{-i\theta}$$

into (11-242). By equating the result to zero, we obtain an equation for $\theta$ from
whose roots the eigenvalues $\lambda_k$ of $\boldsymbol{\phi}_s$ can be calculated: $D(-1/\lambda_k) = 0$.
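As a numerical check of the Markov example, the following sketch evaluates the closed form $D(z) = a^n[(1-a\beta_2)^2\beta_1^{n+1} - (1-a\beta_1)^2\beta_2^{n+1}]/[(1-a^2)(\beta_1-\beta_2)]$, with $\beta_1$ and $\beta_2$ the roots of (11-240), and compares it with a direct evaluation of $\det(\mathbf{I} + z\boldsymbol{\phi}_s)$ for the covariances $\phi_k = Ca^{|k|}$. The function names and test values are illustrative assumptions; a is taken real with 0 < |a| < 1.

```python
import numpy as np

def fredholm_markov(z, C, a, n):
    """Closed-form D(z) for covariances C * a**|k|; z may be complex."""
    K = C * (1.0 - a**2) * z
    b = 1.0 + a**2 + K
    disc = np.sqrt(b * b - 4.0 * a * a + 0j)
    beta1 = (b + disc) / (2.0 * a)       # roots of a w^2 - (1 + a^2 + K) w + a = 0
    beta2 = (b - disc) / (2.0 * a)       # beta1 * beta2 = 1
    num = ((1 - a * beta2) ** 2 * beta1 ** (n + 1)
           - (1 - a * beta1) ** 2 * beta2 ** (n + 1))
    return a**n * num / ((1 - a**2) * (beta1 - beta2))

def fredholm_direct(z, C, a, n):
    """det(I + z * Phi_s) evaluated directly from the Toeplitz matrix."""
    idx = np.arange(n)
    Phi_s = C * a ** np.abs(idx[:, None] - idx[None, :])
    return np.linalg.det(np.eye(n) + z * Phi_s)
```

The closed form is unchanged when $\beta_1$ and $\beta_2$ are interchanged, so the choice of square-root branch is immaterial. It is singular only where the two roots coincide, and for large n the powers $\beta_1^{n+1}$ may call for the rescaling mentioned in the text.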

11.6.4.2 Kalman method. When the signal process $\{S_k\}$ can be modeled as
the output of a discrete-time linear system having a finite number of state variables
and driven by white noise, the Fredholm determinant can be computed by solving
the Kalman equations, much as in Sec. 11.6.2. The ARMA process is a special
case. As in Sec. 11.4, we imagine the signal process $\{S_k\}$ replaced by one with a
power z times as large, so that $\phi_s$ throughout is replaced by $z\phi_s$. We can then use
the Kalman formulation, except with the Q(k) replaced by zQ(k) and the initial
covariance matrix $\Sigma(1)$ replaced by $z\Sigma(1)$. By (11-219) the Fredholm determinant is
then

$$D(z) = \det(\mathbf{I} + z\phi_s) = \prod_{k=1}^{n} W_k,$$

where the $W_k$ are again given by (11-224) in the modified system,

$$W_k = 1 + D(k)P(k \mid k-1)D^+(k),$$

with now

$$P(k \mid k-1) = A(k-1)P(k-1)A^+(k-1) + zQ(k)G(k)G^+(k)$$

in place of (11-225). In the initial conditions given at the end of Sec. 11.6.2, $\Sigma(1)$ is
replaced by $z\Sigma(1)$. The equations of Sec. 11.6.2 involving the state vector Y(k) and
the data are of course omitted from this computation of the Fredholm determinant.
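As an illustration only, the recursion can be sketched for the simplest case of a single state variable. The sketch assumes a first-order signal with covariances $\phi_k = Ca^{|k|}$, a state observed directly (D(k) = 1, G(k) = 1, A(k) = a), driving-noise variance $C(1 - a^2)$, and noise of unit variance; the function names are hypothetical. The product of the $W_k$ is compared with a direct evaluation of $\det(\mathbf{I} + z\phi_s)$.

```python
def fredholm_det_kalman(a, C, n, z):
    """Accumulate D(z) = W_1 W_2 ... W_n for covariances phi_k = C * a**|k|.

    Assumptions (not from the text): scalar state, direct observation,
    unit noise variance, initial variance C, driving-noise variance C(1-a^2).
    """
    p_pred = z * C                   # P(1|0): initial variance, scaled by z
    q = z * C * (1.0 - a * a)        # driving-noise variance, scaled by z
    det = 1.0
    for _ in range(n):
        w = 1.0 + p_pred             # W_k for a scalar state observed directly
        det *= w
        p_filt = p_pred / w          # variance after sample k is incorporated
        p_pred = a * a * p_filt + q  # one-step prediction P(k+1|k)
    return det


def fredholm_det_direct(a, C, n, z):
    """det(I + z*Phi) by Gaussian elimination; used only as a cross-check."""
    m = [[(1.0 if i == j else 0.0) + z * C * a ** abs(i - j)
          for j in range(n)] for i in range(n)]
    det = 1.0
    for i in range(n):
        det *= m[i][i]
        for r in range(i + 1, n):
            factor = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= factor * m[i][c]
    return det
```

For real positive z the factors $W_k$ are the variances of the successive prediction errors, which is why their product reproduces the determinant; the elimination routine omits pivoting and is meant only for that positive-definite case.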



11.6.4.3 Stationary signal processes. If the signal process is merely stationary,
its covariance matrix having the Toeplitz form (11-221), the Fredholm determinant
can be computed by an extension of the Durbin algorithm of Sec. 11.6.3.
Again the signal sequence is replaced by one z times as strong:

$$\phi_k \to z\phi_k, \qquad 0 \le k \le n - 1.$$

A complication arises, however, when both z and $\phi_k$ for k > 0 are complex,
for the method of Sec. 11.6.3 involves taking complex conjugates of the components
of the solution of (11-229). It is now necessary to solve not one set of equations
like (11-229), but two, one with the covariances $\phi_k$, the other with their complex
conjugates $\phi_k^*$. These equations are

$$f_t^{(k)} + z\sum_{m=0}^{k-2} \phi_{t-m}\,f_m^{(k)} = z\phi_{t+1}, \qquad 0 \le t \le k - 2, \tag{11-243}$$

$$g_t^{(k)} + z\sum_{m=0}^{k-2} \phi_{t-m}^*\,g_m^{(k)} = z\phi_{t+1}^*, \qquad 0 \le t \le k - 2. \tag{11-244}$$

Initially

$$f_0^{(2)} = \frac{z\phi_1}{1 + z\phi_0}, \qquad g_0^{(2)} = \frac{z\phi_1^*}{1 + z\phi_0}.$$

The two solution vectors are now updated in accordance with the prescription

$$f_t^{(k+1)} = f_t^{(k)} - \rho_k\,g_{k-2-t}^{(k)}, \qquad g_t^{(k+1)} = g_t^{(k)} - \rho'_k\,f_{k-2-t}^{(k)}, \qquad 0 \le t \le k - 2,$$

with

$$f_{k-1}^{(k+1)} = \rho_k, \qquad g_{k-1}^{(k+1)} = \rho'_k,$$

where

$$\rho_k = \frac{z\left[\phi_k - \sum_{m=0}^{k-2} \phi_{k-1-m}\,f_m^{(k)}\right]}{1 + z\phi_0 - z r_k}, \qquad \rho'_k = \frac{z\left[\phi_k^* - \sum_{m=0}^{k-2} \phi_{k-1-m}^*\,g_m^{(k)}\right]}{1 + z\phi_0 - z r_k}. \tag{11-245}$$

The $r_k$'s can be computed recursively as

$$r_{k+1} = r_k + \rho_k\rho'_k\,(\phi_0 + z^{-1} - r_k), \qquad r_1 = 0.$$

Observe that updating the solution of (11-243) involves the solution of (11-244)
and vice versa. In the course of the procedure, one accumulates the Fredholm
determinant

$$D(z) = \prod_{k=1}^{n} (1 + z\phi_0 - z r_k) \tag{11-246}$$

or, if there is danger of overflow, one accumulates its logarithm. If z is real,
this algorithm reduces to that of Sec. 11.6.3, and only one set of equations needs
to be solved. Having an algorithm to compute the Fredholm determinant D(z) for
both real and complex values of z, one is in a position to compute the detection
probability $Q_d$ and its complement as in (11-94) and (11-95).
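A minimal sketch of such a two-sided recursion follows. It rests on assumptions that should be checked against (11-243) through (11-246): the two systems have the Toeplitz matrices $\mathbf{I} + z\phi_s$ and its transpose, each solution is updated with the other one taken in reversed order, and the determinant is accumulated as a product of pivots $1 + z\phi_0 - z r_k$ with $r_1 = 0$. The names `fredholm_det_durbin`, `det_direct`, `rho`, and `rho_c` are hypothetical, and the direct determinant serves only as a check.

```python
def fredholm_det_durbin(phi, z):
    """D(z) = det(I + z*Phi), Phi Hermitian Toeplitz with first column phi."""
    phi = [complex(p) for p in phi]
    n = len(phi)
    f, g = [], []          # stage-k solutions of the two systems (length k-1)
    r = 0j                 # r_1 = 0
    det = 1 + 0j
    for k in range(1, n + 1):
        pivot = 1 + z * phi[0] - z * r
        det *= pivot
        if k == n:
            break
        # Residuals of the last row of each enlarged system.
        res_f = phi[k] - sum(phi[k - 1 - m] * f[m] for m in range(k - 1))
        res_g = phi[k].conjugate() - sum(
            phi[k - 1 - m].conjugate() * g[m] for m in range(k - 1))
        rho = z * res_f / pivot
        rho_c = z * res_g / pivot
        # Each update uses the *other* solution vector in reversed order.
        f_new = [f[t] - rho * g[k - 2 - t] for t in range(k - 1)] + [rho]
        g_new = [g[t] - rho_c * f[k - 2 - t] for t in range(k - 1)] + [rho_c]
        r += rho_c * res_f  # same as rho*rho_c*(phi[0] + 1/z - r) for z != 0
        f, g = f_new, g_new
    return det


def det_direct(phi, z):
    """Reference value: elimination with partial pivoting on I + z*Phi."""
    phi = [complex(p) for p in phi]
    n = len(phi)
    def cov(j):  # phi_{-j} = conj(phi_j)
        return phi[j] if j >= 0 else phi[-j].conjugate()
    m = [[(1 if t == s else 0) + z * cov(t - s) for s in range(n)]
         for t in range(n)]
    det = 1 + 0j
    for i in range(n):
        p = max(range(i, n), key=lambda row: abs(m[row][i]))
        if p != i:
            m[i], m[p] = m[p], m[i]
            det = -det
        det *= m[i][i]
        for row in range(i + 1, n):
            factor = m[row][i] / m[i][i]
            for c in range(i, n):
                m[row][c] -= factor * m[i][c]
    return det
```

The recursion costs on the order of $n^2$ operations against $n^3$ for elimination, which matters when D(z) must be evaluated at many points along a contour of integration.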

Problems 

11-1. Find the moment-generating function of the quadratic form $U_M = \tfrac12\mathbf{V}^+\mathbf{H}\mathbf{V}$ when the
components of the M-dimensional column vector V are circular Gaussian random
variables with expected values $S_k$ and covariances as given in (11-29).

11-2. Using the result of Problem 11-1, show that if V(t) is a circular Gaussian random
process of expected value S(t) and complex autocovariance function $\phi(t, u)$, the
moment-generating function of the quadratic functional

$$U = \tfrac12 \int\!\!\int V^*(t)\,h(t, s)\,V(s)\,dt\,ds$$



where $h_0(z)$ is the moment-generating function for the process of expected value zero,
and L(t, u; z) is the solution of an integral equation similar to (11-70).

11-3. For an arbitrary quadratic functional of the type given in (11-63) calculate the effective
signal-to-noise ratio defined in (4-66). Show that this effective signal-to-noise ratio is
maximum when h(t, s) is equal to the kernel $h_\theta(t, s)$ of the threshold statistic as given
by (11-50). Use the Schwarz inequality for traces given in Sec. 11.1.10. This problem
is most easily solved in the vector-matrix domain.

11-4. Verify that (11-169) is the solution of the steady-state variance equation (11-164) for
the example treated in Sec. 11.4.1, and show that (11-167) and (11-168) yield the
spectral density in (11-158).

11-5. For the Lorentz spectral density (11-47) show that (11-142) reduces to the Fredholm
determinant in (11-128).

11-6. Set up state equations and determine the matrices F, G, and h K for a stationary process 
$x_1(t)$ whose spectral density is

$$\Phi_1(\omega) = \frac{g(\omega^2 + a^2)}{(\omega^2 + b^2)(\omega^2 + c^2)},$$

taking C = (1 0).

11-7. Let x(t) be a low-pass Gaussian stochastic process with expected value zero and
autocovariance function $\phi(t, u) = E[x(t)x(u)]$. Define its average power during the
interval (0, T) as the random variable

$$y = \frac{1}{T}\int_0^T [x(t)]^2\,dt.$$

(a) Using the methods of this chapter, derive the moment-generating function h(z) =
E[exp(-zy)] of this average power in terms of the Fredholm determinant associated
with the autocovariance function $\phi(t, u)$.

(b) Show how the moment-generating function h(z) could be calculated from a state-
variable model of the generation of x(t), as in (11-155) through (11-157).

(c) Calculate the moment-generating function h(z) when x(t) is a Gaussian Markov
process with autocovariance function $\phi_0 \exp(-\mu|t - u|)$.

(d) Calculate the conditional moment-generating function

$$h(z) = E(e^{-zy} \mid x(0) = 0)$$

when the Gaussian Markov process x(t) is generated as in part (b), but starts at
x(0) = 0 at time t = 0.

Hint: Use Collins's method described in Sec. 11.4.2 for calculating the
Fredholm determinant as in (11-184), taking the initial condition L(0) = 0 on
the state-vector covariance matrix L(·), which, because this is a one-dimensional
process, is a scalar and represents the instantaneous variance of x(t).

11-8. A narrowband Gaussian stochastic signal

$$s(t) = \operatorname{Re} S(t)\,e^{i\Omega t}$$

is to be detected in white Gaussian noise of spectral density N. The real and imaginary
parts of S(t) are independent Wiener-Lévy (or Brownian) processes; that is,

$$S(t) = \int_0^t W(t')\,dt',$$

where W(t) is white Gaussian noise of spectral density R:
$\tfrac12 E[W(t)W^*(s)] = R\,\delta(t - s)$.
Thus

$$\frac{dS}{dt} = W(t), \qquad 0 \le t \le T, \qquad S(0) = 0.$$

Think of this as the dynamical state equation for a linear system with a single state
variable $x_1(t) = S(t)$. The input $v(t) = \operatorname{Re} V(t) \exp i\Omega t$ to the receiver is observed
during the interval $0 \le t \le T$, and either ($H_0$) V(t) = N(t) or ($H_1$) V(t) = N(t) +
S(t). The input noise is $\operatorname{Re} N(t) \exp i\Omega t$.

(a) Calculate the complex autocovariance function

$$\phi_s(t, u) = \tfrac12 E[S(t)S^*(u)]$$

of the signal.

(b) Use the Kalman-Bucy equations to express the minimum-mean-square-error
causal estimator $\hat{S}(t)$ of the signal under hypothesis $H_1$ in terms of the input
$V(t')$, $0 \le t' \le t \le T$.

(c) Derive and describe the causal estimator-correlator form of the optimum detector
for this signal.

(d) Starting with the solution of the Kalman-Bucy equations or otherwise, derive the
kernel h(t, s) of the optimum detection statistic U for this signal.

(e) Calculate the moment-generating functions

$$h_i(z) = E(e^{-zU} \mid H_i), \qquad i = 0, 1,$$

of the optimum detection statistic under both hypotheses and show how to use
them to compute the false-alarm probability $Q_0$ and the probability $Q_d$ of detecting
the signal s(t). Use Collins's formula (11-184) to calculate the Fredholm
determinant involved.

(f) Determine the eigenvalues of the autocovariance function $\phi_s(t, u)$ of the signal
during the interval (0, T), and express the false-alarm and detection probabilities
$Q_0$ and $Q_d$ in terms of them and of the decision level on the statistic U.

11-9. A binary on-off communication system transmits a certain signal for each 1 in the
message, and it sends nothing for each 0. The digits appear every T seconds, and
0's and 1's are equally likely. The signal is received as a pulse s(t) of stationary
narrowband Gaussian noise of duration T. The narrowband spectral density of this
stochastic signal is



C a constant. The input to the receiver always contains white Gaussian noise of
unilateral spectral density N.

(a) Describe the optimum receiver for deciding every T seconds whether a signal has
arrived or not, taking "optimum" in the sense of "attaining minimum probability
of error."

(b) Calculate the minimum attainable probability $P_e$ of error in the limit $\mu T \ll 1$,
as a function of E/N, where E is the average energy of the received signal s(t).
Plot $P_e$ versus E/N for 20 < E/N < 1000.

(c) Calculate the minimum error probability $P_e$ for $\mu T \gg 1$ by making the equal-
eigenvalues approximation. That is, as in (11-112) assume that the autocovariance
function of the signal has m equal eigenvalues $\lambda_k$ approximately equal to $\lambda$ and
that the rest are negligible. As in Sec. 11.2.3, determine $\lambda$ and the effective number
m of eigenvalues in terms of $\mu T$. For m = 10 plot the error probability $P_e$ versus
the ratio E/N over the same range as in part (b).

(d) Use the Toeplitz approximation in Sec. 11.2.5 to calculate the moment-generating
functions $E[\exp(-zU) \mid H_0]$ and $E[\exp(-zU) \mid H_1]$ of the optimum decision statistic
U. Then use the saddlepoint approximation to calculate the minimum error
probability $P_e$ for the same range of values of E/N as before; m = 10. Compare
your results with those of part (c).

(e) Describe the threshold detector for these stochastic signals. Under what conditions,
in terms of E/N and $\mu T$, would you expect it to be nearly as good as the
optimum detector, and why?

(f) Describe in detail how to implement the Schweppe likelihood ratio receiver for
these signals. Write out in explicit form, component by component, all the
Kalman-Bucy equations needed for estimating the incoming signal, and show
how their solution is used in the receiver. It is unnecessary to solve any of the
differential equations.

11-10. In certain on-off binary communication systems nothing is sent for each 0 in a message;
for each 1, signals proportional to $\operatorname{Re} F(t) \exp i\Omega t$ and $\operatorname{Re} G(t) \exp i\Omega t$ are
transmitted simultaneously. The input to the receiver when these have been sent
(hypothesis $H_1$) is $v(t) = \operatorname{Re} V(t) \exp i\Omega t$, with a complex envelope

$$V(t) = AF(t)\,e^{i\theta_1} + BG(t)\,e^{i\theta_2} + N(t), \qquad 0 \le t \le T;$$

N(t) is the complex envelope of white Gaussian noise of unilateral spectral density N.
Because the channel fades, the amplitudes A and B are independently random and
have Rayleigh distributions with the same parameter $\sigma^2$:



The phases $\theta_1$ and $\theta_2$ are independently random, independent of A and B, and uniformly
distributed over $(0, 2\pi)$. Take



Find the optimum receiver for deciding between hypotheses $H_0$, "0 sent," and $H_1$, "1
sent," and calculate its probability of error, assuming the symbols 0 and 1 to occur
with equal prior probabilities. The matrix methods of this chapter will be found
helpful, as will (D-9).

We have simplified this problem by assuming that even though the signal envelopes
AF(t) and BG(t) overlap, they fade independently. A more realistic model
would assign a correlation coefficient to the received complex amplitudes $A \exp i\theta_1$
and $B \exp i\theta_2$. Describe how that could be done and what modifications correlated
fading would require in your analysis.

11-11. A narrowband signal $s(t) = A \operatorname{Re}[F(t - \tau)\exp(i\Omega t + i\phi)]$ of unknown amplitude A,
phase $\phi$, and arrival time $\tau$ is to be detected in the presence of additive white Gaussian
noise of unilateral spectral density N. The input v(t) to the receiver is observed during
an interval (0, T) that is much longer than the reciprocal of the signal bandwidth W;
$WT \gg 1$. Assume that the arrival time $\tau$ of the signal is well within the interval so
that you can neglect the possibility that the signal overlaps one end of the interval or
the other.

The threshold detector for this signal has a filter matched to the signal
$\operatorname{Re} F(t) \exp i\Omega t$ over an interval $0 \le t \le T' \ll T$, outside of which we can assume
the signal to be negligible. The output $\operatorname{Re} V_o(t) \exp i\Omega t$ of the filter is applied to a
quadratic rectifier, whose output is in turn integrated during the interval $(T', T + T')$
to generate the threshold statistic



where V(t) is the complex envelope of the input to the receiver.

(a) Using the methods of this chapter and the results of Problem 11-1, determine
the moment-generating functions $h_i(z)$ (i = 0, 1) of the statistic U under both
hypotheses ($H_0$) "signal absent" and ($H_1$) "signal present with a given arrival
time $\tau_0$, $0 \ll \tau_0 \ll T$."

(b) Describe how you would use these moment-generating functions to calculate the
false-alarm probability $Q_0$ and the probability $Q_d$ of detecting this signal.

(c) Work out the Toeplitz approximations to the moment-generating functions $h_i(z)$
of U in terms of the spectrum $f(\omega)$ of the complex envelope F(t) of the signal.
Then state their forms when the signal is bandlimited to a bandwidth of W hertz,



and determine the false-alarm probability $Q_0$ and the detection probability $Q_d$
for this special case. Express the latter in terms of the signal-to-noise ratio $D^2 =
2E/N$, E the energy of the received signal, and the time-bandwidth product WT.



11-12. In the system treated in Sec. 11.5, assume that the phase of the coherent signal is
unknown; that is, instead of its being $\operatorname{Re} R(t) \exp i\Omega t$, it is $\operatorname{Re} R(t) \exp(i\Omega t + i\psi)$,
with $\psi$ uniformly distributed over $(0, 2\pi)$. Determine the optimum detector for the
combination of this signal and a stochastic signal $\operatorname{Re} S(t) \exp i\Omega t$ of the same type
as postulated in Sec. 11.5. Assuming that the average total energies of both signal
components are proportional to the same positive factor s, work out the threshold
detector for their combination by taking the likelihood functional to the limit $s \to 0$.
Calculate the effective signal-to-noise ratio, the deflection (4-66), of this threshold
detector. Hint: Use (4-75).

11-13. Plot the locus of the roots $\beta_k$ of (11-137) for the spectral density given in (A-17).

11-14. Show in detail how to calculate the n eigenvalues $\lambda_k$ of the Toeplitz matrix $\phi_s$, where
$\phi_s$ is given in (11-239); use (11-242). Calculate those eigenvalues for a = 0.5, n = 8,
and C = 1.

11-15. Write out in detail the steps of the modified Durbin algorithm at the end of Sec. 11.6.4
for k = 1 and k = 2, and verify that they yield the solutions of the simultaneous
equations (11-243) and (11-244) and that (11-246) provides the correct value of the
determinant $\det(\mathbf{I} + z\phi_s)$ for n = 2.

12

Detection of Optical Signals



12.1 PHOTOELECTRON COUNTING 

12.1.1 Properties of the Light and the Detector 

An optical signal is a pulse of electromagnetic waves whose frequencies lie in the 
infrared or visible range. It may be carrying information, as in an on-off binary 
communication system in which a pulse of light is transmitted for each 1 in a mes- 
sage, the absence of a pulse indicating a 0. In optical radar the presence of a target 
is manifested by the reception of a light pulse reflected from it. The simplest method 
of detecting an optical signal is to focus its source onto a surface that emits pho- 
toelectrons and to count the number of electrons emitted during a fixed interval of 
time (0, T). Sections 12.1 and 12.2 will concentrate on this kind of detection. It is 
not necessarily the optimum procedure; information contained in the time at which 
each electron is emitted may enhance the detectability of certain types of optical 
signal. Such detectors will be briefly considered in Sec. 12.5. In Sec. 12.3 we shall 
examine receivers utilizing photomultipliers or basing their decisions on a sample of 
the current at the output of a photosensitive detector, and in Sec. 12.4 the optical 
counterpart of a heterodyne receiver will be treated. 

The light is admitted to an optical receiver through an aperture. The larger the 
aperture, the more light is collected and the greater the probability of detection. The 
very best receiver would base its decisions on an observation of the electromagnetic 
field at its aperture during the interval (0, T). To formulate the optimum processing 
of that aperture field requires taking account of the laws of quantum mechanics,
which govern electromagnetic fields at optical frequencies. A decision theory consistent
with quantum mechanics has been formulated; it is outlined in [Hel76]. It
shows that in many situations the optimum detector counts photons; in others, it is
unknown how physically to realize the optimum processing of the aperture field that 
the theory prescribes. 

The light passing through the aperture of the receiver falls perpendicularly onto 
the photoelectrically emissive surface, which we shall call simply the detector and 
which has a total area A. The light will be assumed linearly polarized, whereupon
its field can be treated as a scalar function of position and time. It will be assumed
that the light field is completely coherent over the entire area A of the detector; it 
consists of plane electromagnetic waves. Any background light will be assumed to 
have been filtered to remove components far from the frequency of the light to be 
detected. The incident light can then be represented as a narrowband signal of the 
form 

$$v(t) = \operatorname{Re} V(t)\,e^{i\Omega t}$$

with complex envelope V(t) and angular carrier frequency $\Omega$; $\Omega = 2\pi f$ and
$f = c/\lambda$, where c is the velocity of light and $\lambda$ is the wavelength of the radiation.

According to the semiclassical theory of light, the field v{t) can be treated as 
a narrowband Gaussian random process. It may have a nonzero expected value 

$$E[v(t) \mid H_1] = \operatorname{Re} S(t)\,e^{i\Omega t}$$

under hypothesis $H_1$ that a signal is present; S(t) is the complex envelope of a
coherent signal that might, for instance, be the output of an ideal pulsed laser. We 
shall assume that the random portion of the light is a stationary process; indeed, 
considering how natural light is created, it is difficult to imagine circumstances under 
which it would not be so. We assign to it the complex autocovariance function 

$$\tfrac12 E\{[V(t_1) - S(t_1)][V^*(t_2) - S^*(t_2)]\} = P\phi(t_1 - t_2). \tag{12-1}$$

The Hermitian function $\phi(\tau) = \phi^*(-\tau)$ will be normalized so that $\phi(0) = 1$; P is
then the total average power of the random, or incoherent, portion of the incident
light. The Fourier transform

$$\Phi(\omega) = \int_{-\infty}^{\infty} \phi(\tau)\,e^{-i\omega\tau}\,d\tau \tag{12-2}$$

will be called the spectral density of the light, $\omega$ representing an angular frequency
deviation from the carrier frequency $\Omega$. In the optics literature $\phi(\tau)$ is called the
temporal coherence function of the light [Man65].

When this light falls on the detector, it ejects photoelectrons. The conditional
probability that one electron is ejected in a brief interval $\Delta t$ is $\lambda(t)\Delta t$, with

$$\lambda(t) = \frac{\eta'}{2hf}\,|V(t)|^2 \tag{12-3}$$

the instantaneous rate of emission; $\eta'$ is called the quantum efficiency of the detector,
$0 < \eta' \le 1$, $h = 6.626 \cdot 10^{-34}$ J-sec is Planck's constant, and $f = \Omega/2\pi$ is the
frequency in hertz; hf is the energy of a single quantum or photon of the light. One
can think of $\eta'$ as the probability that an incident photon ejects an electron that is
counted in the external circuit of the detector. We abbreviate $\eta'/hf$ by $\eta$. Behind
(12-3) lies the assumption that the incident light is not so strong that it significantly
depletes the number of electrons available in the material of the detector.

The probability $\lambda\,\Delta t$ is conditioned on a particular realization of the incident
light field, which is a random process. The expected number of photoelectrons
counted in (0, T) is

$$E(n) = \bar{n} = \bar{n}_0 + \bar{n}_s,$$

where

$$\bar{n}_0 = \tfrac12\eta\,E\int_0^T |V(t) - S(t)|^2\,dt = \eta PT \tag{12-4}$$

is the expected number due to the random component and

$$\bar{n}_s = \tfrac12\eta\int_0^T |S(t)|^2\,dt \tag{12-5}$$

is the expected number due to a coherent component of the light.

Conditionally on a realization $v(t) = \operatorname{Re} V(t) \exp i\Omega t$ of the incident light
field, the emission of photoelectrons is assumed to be a Poisson point process. The
probability that k electrons are ejected between times $t_1$ and $t_2$ has the Poisson form

$$\Pr(n = k \mid V(t),\ t_1 \le t \le t_2) = \frac{m^k}{k!}\,e^{-m}, \qquad m = \int_{t_1}^{t_2} \lambda(t)\,dt, \tag{12-6}$$

and the numbers ejected in disjoint intervals are statistically independent [Sny75,
pp. 38-56], [Hel91, pp. 389-91], [Pap91, pp. 354-8]. We shall be concerned in this
section only with the total number n ejected during the observation interval (0, T).
The probability that k are ejected is then

$$p_k = \Pr(n = k) = E\!\left[\frac{m^k}{k!}\,e^{-m}\right], \qquad m = \tfrac12\eta\int_0^T |V(t)|^2\,dt. \tag{12-7}$$

The expectation E is taken with respect to the distribution of the variable m, which
is random because V(t) is a circular complex Gaussian random process.

12.1.2 The Probability-generating Function 

In detectors of this kind the presence of a signal is indicated when the number n
of electrons exceeds a certain decision level $n_0$. The probabilities of false alarm
and detection are then related to the complementary cumulative distribution of the 
random variable n, and this distribution is in general most directly calculated on the 
basis of the probability-generating function of the number n of ejected electrons. One 
often refers to such distributions as photocount distributions and speaks of photon 
counting, although it is not photons, but photoelectrons, that are being counted. We 
shall need only the elementary aspects of this subject. More extensive treatments 
can be found in [Sny75] and [Sal78]. For descriptions of optical communication and
optical radar systems and technical details of their construction, see books such as
[Gag76], [Gow84], [Sen85], and [Jei92].

The probability-generating function of a nonnegative integer-valued random
variable such as the number n of photocounts is defined as in (5-40),

$$h(z) = E(z^n) = \sum_{k=0}^{\infty} p_k z^k, \tag{12-8}$$

with $p_k = \Pr(n = k)$. The probabilities $p_k$ can be recovered from h(z) by

$$p_k = \frac{1}{k!}\,\frac{d^k h(z)}{dz^k}\bigg|_{z=0}. \tag{12-9}$$

The cumulative distribution of the number n is defined as

$$q_k^- = \Pr(n < k) = \sum_{r=0}^{k-1} p_r, \tag{12-10}$$

and as shown in Sec. 5.2.3, it is given by the contour integral

$$q_k^- = \oint_{C_-} \frac{z^{-k}h(z)}{1 - z}\,\frac{dz}{2\pi i}, \tag{12-11}$$

where $C_-$ is a closed curve surrounding the origin, but enclosing neither the point
z = 1 nor any singularities of h(z). Taking C on the other hand as a closed curve
$C_+$ including both z = 0 and z = 1, but no singularities of h(z), we can write the
complementary cumulative distribution of the number n as

$$q_k^+ = 1 - q_k^- = \sum_{r=k}^{\infty} p_r = \oint_{C_+} \frac{z^{-k}h(z)}{z - 1}\,\frac{dz}{2\pi i}. \tag{12-12}$$



For complicated distributions for which the probabilities $p_k$ cannot easily be calculated
or for numbers k of electrons so large that (12-10) cannot be summed accurately
enough, the cumulative probabilities $q_k^-$ and $q_k^+$ can often be conveniently computed
by numerical integration of (12-11) or (12-12) along a suitably chosen contour. In
particular, saddlepoint integration similar to that described in Sec. 5.2.3 may well
be expeditious. When the number k is far in one tail or the other of the distribution
of the number n of counts, the saddlepoint approximation introduced in Sec. 5.3.2
is useful.
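The numerical integration of (12-11) can be illustrated with a small sketch. It assumes the simplest admissible contour, a circle of radius less than 1 about the origin, sampled at equally spaced angles; on a closed contour the trapezoidal rule then reduces to a plain average. The function names, the radius, and the number of points are illustrative choices, and the Poisson generating function serves only as a test case whose cumulative distribution is known.

```python
import cmath
import math


def cumulative_below(pgf, k, radius=0.5, n_points=128):
    """Approximate Pr(n < k) from the contour integral of z**(-k) h(z)/(1-z).

    The contour is the circle |z| = radius < 1, which must also exclude
    every singularity of the generating function pgf.
    """
    total = 0j
    for j in range(n_points):
        z = radius * cmath.exp(2j * math.pi * j / n_points)
        # dz/(2*pi*i) = z*dtheta/(2*pi), hence the extra factor z.
        total += z ** (1 - k) * pgf(z) / (1 - z)
    return (total / n_points).real


def poisson_pgf(m):
    """Generating function exp[m(z - 1)] of a Poisson variable of mean m."""
    return lambda z: cmath.exp(m * (z - 1))
```

For instance, `cumulative_below(poisson_pgf(3.0), 5)` should agree closely with the sum of the first five Poisson probabilities of mean 3. For counts k far in the tails the factor $z^{-k}$ varies enormously around a fixed circle, and a contour through the saddlepoint is the better choice.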

12.1.3 Evaluating the Probability-generating Function 

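The probability-generating function for the Poisson distribution with expected value
m is

$$h(z) = \sum_{k=0}^{\infty} \frac{m^k}{k!}\,e^{-m} z^k = \exp[m(z - 1)] \tag{12-13}$$

by (12-8). For the number n of photoelectrons ejected by the type of light treated in
Sec. 12.1.1, we find from (12-7) the probability-generating function

$$h(z) = E\left\{\exp\left[\tfrac12\eta(z - 1)\int_0^T |V(t)|^2\,dt\right]\right\}, \tag{12-14}$$

in which V(t) is the complex envelope of a narrowband Gaussian random process
with expected value $\operatorname{Re}[S(t) \exp i\Omega t]$. In order to evaluate h(z), we introduce the
eigenvalues $\lambda_k$ and the eigenfunctions $f_k(t)$ of the temporal coherence function $\phi(\tau)$:

$$\lambda_k f_k(t) = \frac{1}{T}\int_0^T \phi(t - u)f_k(u)\,du, \qquad 0 \le t \le T. \tag{12-15}$$

Because $\phi(0) = 1$, the eigenvalues $\lambda_k$ sum to 1.
Define the Fourier coefficients

$$V_k = \int_0^T f_k^*(t)V(t)\,dt, \qquad S_k = E(V_k) = \int_0^T f_k^*(t)S(t)\,dt.$$

Then the $V_k$ are independent circular complex Gaussian random variables whose
real and imaginary parts have variances $P\lambda_k T$:

$$\tfrac12 E[|V_k - S_k|^2] = P\lambda_k T.$$

The joint probability density function of these parts has the circular Gaussian form
(see (3-49))

$$p(V_{kx}, V_{ky}) = \frac{1}{2\pi P\lambda_k T}\exp\left[-\frac{|V_k - S_k|^2}{2P\lambda_k T}\right], \qquad V_k = V_{kx} + iV_{ky}.$$

Furthermore in (12-14)

$$\int_0^T |V(t)|^2\,dt = \sum_k |V_k|^2,$$

and the probability-generating function can be written

$$h(z) = \prod_k E\{\exp[\tfrac12\eta(z - 1)|V_k|^2]\}$$

with

$$E\{\exp[\tfrac12\eta(z - 1)|V_k|^2]\} = \frac{1}{2\pi P\lambda_k T}\int\!\!\int \exp\left[\tfrac12\eta(z - 1)|V_k|^2 - \frac{|V_k - S_k|^2}{2P\lambda_k T}\right] dV_{kx}\,dV_{ky}$$

$$= \frac{1}{1 - \bar{n}_0\lambda_k(z - 1)}\exp\left[\frac{\tfrac12\eta(z - 1)|S_k|^2}{1 - \bar{n}_0\lambda_k(z - 1)}\right].$$

The probability-generating function of the number of photoelectrons is therefore

$$h(z) = \prod_k [1 - \bar{n}_0\lambda_k(z - 1)]^{-1}\exp\left[\frac{\tfrac12\eta(z - 1)|S_k|^2}{1 - \bar{n}_0\lambda_k(z - 1)}\right]. \tag{12-16}$$

If we define the Fredholm determinant associated with the kernel $\phi(\tau)$ of (12-15) as
in (11-76),

$$D(x) = \prod_k (1 + \lambda_k x), \tag{12-17}$$

the first factor in h(z) is $[D(\bar{n}_0(1 - z))]^{-1}$.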




Replacing $\bar{n}_0(1 - z)$ by x, we write the quadratic form in the exponent of
(12-16) as

$$\sum_k \frac{|S_k|^2}{1 + \lambda_k x} = \sum_k S_k^* Q_k, \qquad Q_k = \frac{S_k}{1 + \lambda_k x},$$

and to $(1 + \lambda_k x)Q_k = S_k$ corresponds the integral equation

$$S(t) = Q(t; x) + \frac{x}{T}\int_0^T \phi(t - u)Q(u; x)\,du, \qquad 0 \le t \le T, \tag{12-18}$$

as in (11-193). Our probability-generating function thus becomes

$$h(z) = [D(x)]^{-1}\exp[\tfrac12\eta(z - 1)J(x)], \qquad x = \bar{n}_0(1 - z), \tag{12-19}$$

with

$$J(x) = \int_0^T S^*(t)Q(t; x)\,dt \tag{12-20}$$

as in (11-198). When the spectral density $\Phi(\omega)$ is a rational function of the frequency
$\omega$, all the methods developed in Secs. 11.3 through 11.5 can be applied to evaluating
(12-19). We begin with simple cases.

When the complex envelope V(t) of the incident light is written as a Karhunen-
Loève expansion in terms of the eigenfunctions $f_k(t)$ of the autocovariance function
$\phi(\tau)$, defined by (12-15), the individual terms are often called temporal modes of
the light field. If the light, instead of being linearly polarized as we have been assuming,
is unpolarized, its incoherent part can be decomposed into two statistically
independent components, each of which can in turn be broken into a set of statistically
independent temporal modes whose coefficients $V_k^{(1)}$ and $V_k^{(2)}$ have the same
statistical properties. The number of modes is in effect doubled, and the factor
$[D(x)]^{-1}$ in (12-19) is simply replaced by $[D(x)]^{-2}$. If the incoherent portion of the
light is partially polarized, it can again be decomposed into statistically independent
components, but these will account for different expected numbers $\bar{n}_0^{(1)}$ and $\bar{n}_0^{(2)}$ of
photoelectrons [Hel64]. One then replaces D(x) in (12-19) by

$$D(x_1)D(x_2), \qquad x_i = \bar{n}_0^{(i)}(1 - z), \quad i = 1, 2.$$



12.1.4 A Single Temporal Mode 



When the product WT of the bandwidth W of the light and the observation time T
is small, $WT \ll 1$, a single eigenvalue $\lambda_1$ of (12-15) predominates, $\lambda_1 \approx 1$, and the
rest are negligible. The random component of the light is then completely temporally
coherent and can be described during (0, T) by a single circular Gaussian random
variable $V_1$. The magnitude $|V_1|$ has a Rayleigh-Rice distribution as in (3-71). This
temporal coherence should be distinguished from that characterizing the output of
an ideal laser, such as would be represented by the signal $s(t) = \operatorname{Re} S(t) \exp i\Omega t$.
For the latter the magnitude of the field is fixed; for the former it is a random
variable.






In the absence of a deterministic component S(t), the probability-generating
function of the number n of photoelectrons ejected during (0, T) when this light
strikes our detector is, by (12-16),

$$h(z) = \frac{1}{1 - \bar{n}_0(z - 1)},$$

where $\bar{n}_0 = \eta PT$ is the expected number of electrons. The probability $p_r$ of counting
r electrons is then

$$p_r = (1 - v)v^r, \quad r = 0, 1, 2, \ldots, \qquad v = \frac{\bar{n}_0}{1 + \bar{n}_0} < 1,$$

and the cumulative distributions are

$$q_r^+ = v^r, \qquad q_r^- = 1 - v^r, \qquad r \ge 0.$$

This is called the "geometric" or sometimes the "exponential" distribution.

When a signal component is present and ejects an expected number $\bar{n}_s$ of
electrons during (0, T), the probability-generating function is, by (12-16),

$$h(z) = \frac{1}{1 - \bar{n}_0(z - 1)}\exp\left[\frac{\bar{n}_s(z - 1)}{1 - \bar{n}_0(z - 1)}\right], \tag{12-21}$$

and one finds by using the generating function of the Laguerre polynomials $L_r(\cdot)$
that

$$p_r = (1 - v)\,e^{-\bar{n}_s(1 - v)}\,v^r L_r\!\left[-\frac{\bar{n}_s(1 - v)^2}{v}\right] \tag{12-22}$$

[Erd53, vol. 2, p. 189, eq. (17)]. These polynomials obey the recurrent relation in
(C-6).
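The Laguerre distribution is easily tabulated by recurrence. The sketch below assumes that the relation cited as (C-6) is the standard three-term recurrence $(r + 1)L_{r+1}(x) = (2r + 1 - x)L_r(x) - rL_{r-1}(x)$ with $L_0 = 1$, and that the probabilities take the form $p_r = (1 - v)\exp[-\bar{n}_s(1 - v)]\,v^r L_r[-\bar{n}_s(1 - v)^2/v]$ with $v = \bar{n}_0/(1 + \bar{n}_0)$, which follows from (12-21) and the generating function of the Laguerre polynomials. The function name and argument names are hypothetical.

```python
import math


def laguerre_counts(n0, ns, r_max):
    """Probabilities p_0..p_rmax of the single-mode (Laguerre) distribution.

    n0 > 0 is the expected count from the random light, ns >= 0 that from
    the coherent component.
    """
    v = n0 / (1.0 + n0)
    x = -ns * (1.0 - v) ** 2 / v           # argument of the polynomials
    scale = (1.0 - v) * math.exp(-ns * (1.0 - v))
    probs = []
    l_prev, l_cur = 0.0, 1.0               # placeholder for L_{-1}, and L_0
    v_pow = 1.0                            # running value of v**r
    for r in range(r_max + 1):
        probs.append(scale * v_pow * l_cur)
        # (r+1) L_{r+1}(x) = (2r + 1 - x) L_r(x) - r L_{r-1}(x)
        l_prev, l_cur = l_cur, ((2 * r + 1 - x) * l_cur - r * l_prev) / (r + 1)
        v_pow *= v
    return probs
```

With `ns = 0` the argument x vanishes, every $L_r(0) = 1$, and the geometric distribution is recovered. Because x is negative all terms of the recurrence are positive, so no cancellation occurs.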

12.1.5 Many Temporal Modes

Comparing (12-21) and (12-16) we see that the kth temporal mode $V_k f_k(t)$ can
be thought of as accounting for a random number $n_k$ of photoelectrons; the total
number ejected in (0, T) is the sum

$$n = \sum_k n_k,$$

whose terms $n_k$ are independently random. The number $n_k$ has a Laguerre distribution
as in (12-22),

$$\Pr(n_k = r) = (1 - v_k)\exp[-R_k(1 - v_k)]\,v_k^r L_r\!\left[-\frac{R_k(1 - v_k)^2}{v_k}\right], \tag{12-23}$$

$$v_k = \frac{\bar{n}_0\lambda_k}{1 + \bar{n}_0\lambda_k}, \qquad R_k = \tfrac12\eta|S_k|^2,$$

with $\bar{n}_0 = \eta PT$ the total expected number of counts arising from the incoherent part
of the light and

$$\bar{n}_s = \sum_k R_k$$

the total expected number arising from the coherent part.

The probability distribution $\{p_r\}$ of the total number n of electrons can be
computed by successively convolving the distributions in (12-23) until the expected
value $\bar{n}_0\lambda_k + R_k$ associated with the latest convolvendum becomes negligible. The
cumulative distribution can then be calculated by summing the $p_r$. Alternatively
one can put $z = \exp(-i\omega)$ into (12-16) and compute the probabilities $p_r$ by means of
the fast Fourier transform of $h[\exp(-i\omega)]$. These methods will be feasible if neither
the number of significant eigenvalues $\lambda_k$ nor the number k of electrons for which
$q_k^{\pm}$ is needed is excessive.

When the light has a Lorentz spectral density

$$\Phi(\omega) = \frac{2\mu}{\mu^2 + \omega^2}$$

and there is no coherent component, the probability-generating function of the number
of electrons is

$$h(z) = e^{m}\left[\cosh g + \tfrac12\left(\frac{g}{m} + \frac{m}{g}\right)\sinh g\right]^{-1},$$

$$m = \mu T, \qquad g = [m^2 - 2m\bar{n}_0(z - 1)]^{1/2},$$

by (12-19) and (11-128). Bedard described a recurrent technique for evaluating the
probabilities $p_r$ for this type of incident light [Bed66].

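12.1.6 The Toeplitz Approximation

When the time-bandwidth product WT is large, the Fredholm determinant D(x)
involved in (12-16) will be

$$D(x) \approx \exp\left\{T\int_{-\infty}^{\infty} \ln\left[1 + \frac{x}{T}\Phi(\omega)\right]\frac{d\omega}{2\pi}\right\}$$

as in (11-102). The limits of integration in (12-18) can be replaced by $-\infty$ and $\infty$
and Q(t; x) by $Q_\infty(t; x)$, where

$$S(t) = Q_\infty(t; x) + \frac{x}{T}\int_{-\infty}^{\infty} \phi(t - u)Q_\infty(u; x)\,du.$$

This equation can be solved by Fourier transformation to yield

$$\int_{-\infty}^{\infty} Q_\infty(t; x)\,e^{-i\omega t}\,dt = \frac{s(\omega)}{1 + (x/T)\Phi(\omega)},$$

where

$$s(\omega) = \int_{-\infty}^{\infty} S(t)\,e^{-i\omega t}\,dt.$$

Then in (12-19)

$$J(x) \approx \int_{-\infty}^{\infty} \frac{|s(\omega)|^2}{1 + (x/T)\Phi(\omega)}\,\frac{d\omega}{2\pi},$$

and we obtain for the probability-generating function of the number of electrons,
approximately,

$$h(z) \approx \exp\left\{-T\int_{-\infty}^{\infty} \ln\left[1 + \frac{x}{T}\Phi(\omega)\right]\frac{d\omega}{2\pi} + \tfrac12\eta(z - 1)\int_{-\infty}^{\infty} \frac{|s(\omega)|^2}{1 + (x/T)\Phi(\omega)}\,\frac{d\omega}{2\pi}\right\}, \qquad x = \bar{n}_0(1 - z). \tag{12-25}$$

Computing even approximate cumulative distributions $q_k^-$ and $q_k^+$ from this
probability-generating function will usually require numerical contour integration
as in (12-11) and (12-12).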
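How good the Toeplitz approximation is can be judged in a case where the Fredholm determinant is known in closed form. The sketch below assumes a Lorentz spectral density normalized as $\Phi(\omega) = 2\mu/(\mu^2 + \omega^2)$, so that $\phi(0) = 1$, and takes the exact determinant from the closed form quoted for the Lorentz case in Sec. 12.1.5, $D(x) = e^{-m}[\cosh g + \tfrac12(g/m + m/g)\sinh g]$ with $m = \mu T$ and $g = (m^2 + 2mx)^{1/2}$. The integral over frequency is evaluated by the midpoint rule after the substitution $\omega = \mu\tan\theta$; function names and step counts are illustrative.

```python
import math


def log_fredholm_toeplitz(mu, T, x, n_steps=2000):
    """T * integral of ln[1 + (x/T) Phi(w)] dw/(2 pi), Lorentz Phi assumed."""
    b = 2.0 * mu * x / T                   # (x/T) Phi(w) = b / (mu^2 + w^2)
    h = math.pi / n_steps
    total = 0.0
    for j in range(n_steps):
        theta = -0.5 * math.pi + (j + 0.5) * h   # w = mu * tan(theta)
        c2 = math.cos(theta) ** 2
        # dw = mu dtheta / cos^2(theta); midpoints avoid the end points
        total += mu * math.log1p(b * c2 / mu ** 2) / c2
    return T * total * h / (2.0 * math.pi)


def log_fredholm_exact(mu, T, x):
    """ln D(x) from the closed form for the Lorentz spectral density."""
    m = mu * T
    g = math.sqrt(m * m + 2.0 * m * x)
    return -m + math.log(math.cosh(g) + 0.5 * (g / m + m / g) * math.sinh(g))
```

For this spectral density the integral can also be done analytically and equals g - m, which the numerical value should reproduce. With $\mu T = 20$ and x = 5 the exact logarithm exceeds the Toeplitz value by about 0.01, a correction that does not grow with T.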

12.1.7 Rectangular Spectral Density 

When the spectral density of the random component of the light is rectangular with
bandwidth W,

$$\Phi(\omega) = \begin{cases} W^{-1}, & |\omega| < \pi W, \\ 0, & |\omega| > \pi W, \end{cases} \tag{12-26}$$

and $WT \gg 1$, the Toeplitz approximation (12-25) yields for the probability-
generating function of the number of ejected electrons

$$h(z) = [1 + N(1 - z)]^{-WT}\exp\left[\frac{\bar{n}_s(z - 1)}{1 + N(1 - z)}\right], \qquad N = \frac{\bar{n}_0}{WT}, \tag{12-27}$$

with $\bar{n}_0$ and $\bar{n}_s$ the expected numbers of electrons ejected by the random and the
coherent components of the light, respectively. The parameter N is the expected
number of electrons ejected by each significant temporal mode of the random component
of the incident light. Here it is assumed that all frequency components of both
the signal and the noise outside the band $-\pi W < \omega < \pi W$ have been eliminated by
filtering at the input.
We can write (12-27) as

$$h(z) = \left(\frac{1 - v}{1 - vz}\right)^{M}\exp\left[-\frac{\bar{n}_s(1 - v)(1 - z)}{1 - vz}\right], \qquad v = \frac{N}{N + 1}, \quad M = WT. \tag{12-28}$$

Then by the generating function of the associated Laguerre polynomials $L_r^{(\alpha)}(\cdot)$ we
can write the probability $p_r$ that r electrons are counted as

$$p_r = (1 - v)^M \exp[-\bar{n}_s(1 - v)]\,v^r L_r^{(M-1)}\!\left[-\frac{\bar{n}_s(1 - v)^2}{v}\right] \tag{12-29}$$

[Erd53, vol. 2, p. 189, eq. (17)]. These associated Laguerre polynomials obey the
recurrent relation given in (C-22).

In the absence of a coherent component, $\bar{n}_s = 0$, the number of photoelectrons
counted has a "hypergeometric" or "Bose-Einstein" or "negative binomial"
distribution,

$$p_r = \frac{\Gamma(r + M)}{r!\,\Gamma(M)}\,(1 - v)^M v^r,$$

with M = WT and v as in (12-28). The time-bandwidth product WT need not be
an integer. The use of this distribution as an approximation for light with other
spectral densities was proposed in [Bed67]; one can take the effective bandwidth
W as defined in (11-93). The approximation loses accuracy in the far tails of the
distribution when the spectral density $\Phi(\omega)$ departs much from the rectangular form.

When the time-bandwidth product $WT$ is extremely large, whatever the form
of the spectral density $\Phi(\omega)$ of the light, the term $\Phi(\omega)$ in (12-25) becomes very
small, and we find approximately

$$h(z) = \exp[(n_0 + n_s)(z - 1)], \qquad n_0 = \eta P T, \qquad n_s = \eta \int_{-\infty}^{\infty} |s(\omega)|^2\, \frac{d\omega}{2\pi}.$$

The distribution of the number of electrons then takes the Poisson form in (12-6)
with $m = n_0 + n_s$. The integration time $T$ in (12-14) is now so long that
the time-integrated intensity of the random component of the light
equals its expected value $PT$ within a vanishingly small fluctuation, whereupon the
probability-generating function in (12-14) takes the form of (12-13), and that leads
to the Poisson distribution.

12.1.8 Light with a Rational Spectral Density 

When the spectral density $\Phi(\omega)$ of the light is a rational function of the frequency
so that the temporal coherence function $\phi(\tau)$ has the form in (11-103), the Fredholm
determinant $D(x)$ in (12-19) can be calculated as in (11-142). The matrix $G = G(T)$
figuring there is as written in (11-143). As in Sec. 11.3.4 the parameters $\mu_k$,
$1 \le k \le 2n$, are the poles of the function $1 + x\Phi(-ip)$, the first $n$
$(\mu_1, \ldots, \mu_n)$ lying in the right half of the $p$-plane, the rest
$(\mu_{n+1}, \ldots, \mu_{2n})$ in the left half. The parameters
$\beta_k$, $1 \le k \le 2n$, are the zeros of that function.

If the complex envelope $S(t)$ of the coherent component of the light can be
written as a sum of sine waves,



then the function $J(x)$ appearing in the probability-generating function (12-19) can
be written as in Sec. 11.5.2:

where by (11-201), mutatis mutandis,





r 



//(»'■»,;- v) 



ni 



('{ir w T - w„T), 



a in 



/■/(ir;.v) 



1 + ^(ir). 



r - 0. 
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and by (11-202) 

1 ,T0 El 
/2 = -£FG det lF g} 

As in (11-203), the $2n$ elements of the row vector $E$ are

and as in (11-204) the $2n$ elements of the column vector $F$ are

Fk = y — Zz i < k < 2«, 

{exp w,,,!, 1 < k < n, 
1, n + I < k <2n. 

Once the coefficients $\alpha_m$ of the signal have been calculated by, for instance, taking
the $w_m$ as multiples of $2\pi/T$ and evaluating the Fourier transform of $S(t)$ by means
of a fast Fourier transform algorithm, the probability-generating function in (12-19)
can be computed at points on the contours of integration in (12-11) and (12-12) by
standard routines for solving algebraic equations and evaluating determinants.

Alternatively, one can apply (11-130) to calculating the Fredholm determinant
by

$$\ln D(x) = \int_0^T m(\tau, \tau; x; \tau)\, d\tau,$$

where $m(\tau, t; x; \tau)$ is the solution of the integral equation

$$m(\tau, t; x; \tau) + x \int_0^{\tau} m(\tau, u; x; \tau)\,\phi(u - t)\, du = x\,\phi(\tau - t), \qquad 0 \le t \le \tau \le T,$$

corresponding to (11-132) with $r = \tau$. Furthermore, as in (11-210) the function $J(x)$
can be determined by

$$J(x) = \int_0^T [S^*(\tau) - \hat{S}^*(\tau; x)][S(\tau) - \hat{S}(\tau; x)]\, d\tau,$$

where

$$\hat{S}(\tau; x) = \int_0^{\tau} m(\tau, u; x; \tau)\, S(u)\, du$$

is the causal quasi-estimate of the coherent signal $S(t)$ as in (11-209). Like the Fredholm
determinant $D(x)$, that quasi-estimate can be computed by solving Kalman-Bucy
equations as described in Secs. 11.4 and 11.5. These methods have been used
in [Hel87] to calculate photocount distributions.

12.2 DETECTABILITY OF OPTICAL SIGNALS
BY PHOTOCOUNTING

12.2.1 Negligible Background Light 

According to the Planck law, thermal background light contributes an average number

$$N = \frac{1}{\exp(hf/kT) - 1} \tag{12-31}$$

of photons to each temporal mode at frequency $f$ hertz, where $k = 1.38 \cdot 10^{-23}$ J/K
is Boltzmann's constant and $T$ is the absolute temperature of the background
radiation; $hf$ is again the energy per photon of the light. At the wavelength
$\lambda = 5890$ Å $= 0.589\ \mu$m, which is that of the chief line in the spectrum of sodium,
the quantity $hf/k$ equals $2.446 \cdot 10^4$ K. Any background at an effective temperature
much less than this will contribute a negligible number of photons and hence
eject a negligible average number $\eta N$ of photoelectrons. If our receiver is far out
in space, therefore, and not pointed toward the sun or any other intensely radiating
body, we can set $N = 0$ under hypothesis $H_0$ that no signal is present.

The receiver will thereupon decide for hypothesis $H_1$ whenever any photoelectrons
at all are counted. In order to attain a preassigned false-alarm probability
$Q_0$, it must utilize a randomized strategy as described in Sec. 1.2.5. When no
photoelectrons at all are counted, it chooses hypothesis $H_1$ with probability $Q_0$. The
probability of detection is then

$$Q_d = Q_0 \Pr(n = 0 \mid H_1) + \Pr(n > 0 \mid H_1) = 1 - (1 - Q_0)\Pr(n = 0 \mid H_1). \tag{12-32}$$

If $h_1(z)$ is the probability-generating function of the number $n$ of electrons under
hypothesis $H_1$,

$$Q_d = 1 - (1 - Q_0)\, h_1(0) \tag{12-33}$$

by (12-8). For false-alarm probabilities $Q_0$ less than $10^{-4}$ or so, the term $Q_0$ in
(12-33) can be neglected, and for $Q_d$ greater than about 0.1 the detection probability
is nearly independent of the false-alarm probability, which may as well be set equal
to zero.
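Equation (12-33) needs only the value of the probability-generating function at $z = 0$. The sketch below evaluates it for two illustrative cases: a Poisson count, for which $h_1(0) = \exp(-n_s)$, and the negative binomial form (12-30) applied to signal light spread over a number of temporal modes, for which $h_1(0) = (1 + n_s/M)^{-M}$. Using (12-30) for the signal light is an assumption made here for illustration (the text proposes it only as an approximation), and all function names are hypothetical.

```python
import math

def detection_probability(h1_at_zero, q0=0.0):
    """Q_d = 1 - (1 - Q_0) h_1(0), as in (12-33)."""
    return 1.0 - (1.0 - q0) * h1_at_zero

def h1_zero_poisson(n_s):
    """h_1(0) for a Poisson count with expected value n_s."""
    return math.exp(-n_s)

def h1_zero_negative_binomial(n_s, modes):
    """h_1(0) for the negative binomial form with `modes` temporal modes
    (illustrative assumption: n_s / modes electrons expected per mode)."""
    return (1.0 + n_s / modes) ** (-modes)

if __name__ == "__main__":
    for modes in (1.0, 4.0, 16.0, 1e6):
        qd = detection_probability(h1_zero_negative_binomial(3.0, modes))
        print(f"modes = {modes:g}: Q_d = {qd:.4f}")
    print("Poisson:", detection_probability(h1_zero_poisson(3.0)))
```

As the number of modes grows, the negative binomial value approaches the Poisson value, in line with the limiting curve described for Fig. 12-1.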

We first consider the detection of incoherent light, $S(t) = 0$, in the absence of
thermal background. If the light has a Lorentz spectral density as in (12-24), we
find from (11-128) that

where $m = \mu T$ is the time-bandwidth product and $n_s$ is the expected number of
photoelectrons ejected by the signal during the observation interval $(0, T)$. In Fig. 12-1
we have plotted this probability of detection for a number of values of $m = \mu T$.
The larger the effective number $m$ of temporal modes among which the light is
divided, the greater is the probability of detecting it. The curve $m = \infty$ represents the
probability

$$Q_d = 1 - \exp(-n_s) \tag{12-34}$$

and also represents the probability of detecting a coherent optical signal
$\mathrm{Re}\, S(t) \exp i\Omega t$ in the absence of background light.

For incoherent signal light with an arbitrary spectral density, the probability
of detection is, by (12-16) with $r = 0$, $Q_0 \ll 1$,

Qd = i -n< ! + «At-r', 

i. 

where $\lambda_k$ are the eigenvalues of (12-15).

Figure 12-1. Detection probability $Q_d$ of incoherent light with a Lorentz spectral
density versus the expected number $n_s$ of photoelectrons ejected by the signal. The
curves are indexed by the time-bandwidth product $m = \mu T$.

When the light is incoherent and has the
rectangular spectral density (12-26), as in (11-110),




2c 



the $\lambda'_k$ being the eigenvalues tabulated in [Sle65b], where the parameter $c$ equals
$\pi WT/2$. In Fig. 12-2 we have plotted the probability of detection for various values
of $WT$. As we said at the beginning of Sec. 11.2.7, the bandwidth $W$ as defined
by (11-93) equals the parameter $\mu$ in the Lorentz spectral density. By comparing
Figs. 12-1 and 12-2 we see that for equal values of $WT$, the signal with a Lorentz
spectral density is the more easily detected.

12.2.2 Detection of a Coherent Signal 

When a coherent signal $s(t) = \mathrm{Re}\, S(t) \exp i\Omega t$, such as the output of an ideal laser,
is to be detected in the presence of background radiation admitted in a spectral
band of width $W$ hertz, the receiver must decide between two hypotheses: ($H_0$)
the number $n$ of photoelectrons counted during the observation interval $(0, T)$ has
the hypergeometric distribution in (12-30), or ($H_1$) the number $n$ obeys the Laguerre
distribution in (12-29). There $M = WT$ is the time-bandwidth product, $N = n_0/WT$
is the expected number of photoelectrons ejected by the background light in each
temporal mode, and as in (12-5) $n_s$ is the expected number ejected by the signal.

Figure 12-2. Detection probability $Q_d$ of incoherent light with a rectangular spectral
density versus the expected number $n_s$ of photoelectrons ejected by the signal.
The curves are indexed by the time-bandwidth product $WT$.

Randomization is necessary if the receiver is to attain a preassigned false-alarm
probability $Q_0$. It decides for hypothesis $H_1$ when the number $n$ exceeds a decision
level; when $n$ equals that level, it chooses hypothesis $H_1$ with probability $f$. These parameters
are determined by (1-57) with the probabilities in (12-30), and the probability $Q_d$
of detection is calculated from (1-58) with those in (12-29). In Fig. 12-3 we have
plotted as solid lines the detection probability for $N = 0.1$, $Q_0 = 10^{-6}$, versus the
quantity

$$D = \left[\frac{2 n_s}{1 + N}\right]^{1/2},$$

which measures the strength of the signal. The dashed line (H) represents the detection
probability attained by the heterodyne receiver to be described in Sec. 12.4.

Figure 12-3. Detection probability of a coherent signal in background light that
provides $N$ photoelectrons per temporal mode versus the signal-to-noise ratio
$D = [2n_s/(1 + N)]^{1/2}$; $Q_0 = 10^{-6}$, $N = 0.1$. The curves are indexed by the number
$M = WT$ of temporal modes of the background. The dashed line (H) gives the
detection probability attained by a heterodyne detector.

12.2.3 The Poisson Limit 

When the time-bandwidth product WT is very large, even though the expected num- 
ber of photoelectrons ejected by the background light in each temporal mode may 
be very small, the total expected number n Q of these may be substantial. Whether 
the signal is incoherent (Gaussian) light spread over a broad band of frequencies or 
completely coherent light, like the output of a perfect laser, the receiver must then 
choose between two hypotheses: 

($H_0$): Only background light is present, and the probability of $k$ counts is

$$\Pr(n = k \mid H_0) = \frac{n_0^k}{k!} \exp(-n_0), \tag{12-35}$$

or

($H_1$): Background light and a signal are both present, and the probability of $k$
counts is

$$\Pr(n = k \mid H_1) = \frac{m_1^k}{k!} \exp(-m_1), \qquad m_1 = n_0 + n_s, \tag{12-36}$$

where $n_s$ is the expected number of photoelectrons ejected by the signal alone. We
call this the Poisson limit. The number $n_0$ may represent the expected number of
counts resulting from dark current in the detector, that is, from stray electrons ejected
by thermal agitation of the cathode, background radioactivity, or the like.

The optimum treatment of the datum $n$ requires randomization as described in
Sec. 1.2.5 in order to attain a preassigned false-alarm probability $Q_0$. By applying
(1-57) and (1-58) one can compute the probability $Q_d$ of detection. One obtains
curves like those in Fig. 12-4, for which $Q_0 = 10^{-6}$ and in which the curves are
indexed by the expected number $n_0$ of electrons under hypothesis $H_0$. Even a very
small average number $n_0$ of these significantly increases the expected number $n_s$ of
electrons that must be ejected by the signal in order to attain a detection probability
of the order of 0.99 or greater.
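The randomized test in the Poisson limit can be sketched numerically as follows. Equations (1-57) and (1-58) are not reproduced in this section, so the sketch assumes the usual randomized rule: choose $H_1$ when $n$ exceeds an integer level $c$, choose it with probability $f$ when $n = c$, and fix $c$ and $f$ so that the false-alarm probability equals $Q_0$ exactly. The names `randomized_test`, `c`, and `f` are hypothetical.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability of k counts, as in (12-35) and (12-36)."""
    if mean == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def poisson_tail(c, mean, terms=400):
    """Pr(n > c), summed term by term so that tiny tails keep their precision."""
    return sum(poisson_pmf(k, mean) for k in range(c + 1, c + 1 + terms))

def randomized_test(n0, q0):
    """Smallest level c with Pr(n > c | H0) <= Q0, and the randomization
    probability f that brings the false-alarm probability up to exactly Q0."""
    c = 0
    while poisson_tail(c, n0) > q0:
        c += 1
    f = (q0 - poisson_tail(c, n0)) / poisson_pmf(c, n0)
    return c, f

def detection_probability(n0, ns, q0):
    """Q_d = Pr(n > c | H1) + f Pr(n = c | H1) with m_1 = n_0 + n_s."""
    c, f = randomized_test(n0, q0)
    m1 = n0 + ns
    return poisson_tail(c, m1) + f * poisson_pmf(c, m1)

if __name__ == "__main__":
    for n0 in (0.0, 0.1, 1.0, 10.0):
        print(n0, detection_probability(n0, 20.0, 1e-6))
```

With $n_0 = 0$ the rule reduces to $c = 0$, $f = Q_0$, and the result agrees with (12-33) for a Poisson count. The fixed number of terms in the tail sum is adequate only for moderate expected counts.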



12.3 PHOTOMULTIPLICATION AND SHOT NOISE 

12.3.1 Single-stage and Multistage Photomultipliers 

In detecting very weak light signals there is a danger that the small number of
photoelectrons ejected may be lost amid the noise in the external circuit that is
to count them. Figure 12-5 shows the effect of noise added after a photoelectric
detector with no background light incident upon it. The Poisson limit $WT \gg 1$ was
assumed. The photoelectrons ejected into the external circuit of the detector might,
for instance, be stored on a capacitor, to whose voltage a subsequent amplifier adds
Gaussian noise. In effect, then, hypothesis $H_1$ is chosen when the random variable
$v = n + x$ exceeds a decision level $v_0$. Under both hypotheses $x$ is a Gaussian
random variable with expected value zero and variance $\sigma^2$. Under $H_1$, $n$ has a
Poisson distribution with expected value $n_s$; under $H_0$, $n \equiv 0$. The decision level $v_0$
was selected to yield a false-alarm probability

$$Q_0 = \mathrm{erfc}(v_0/\sigma) = 10^{-6}.$$

The curves in Fig. 12-5 are indexed by the standard deviation $\sigma$ of the noise.

The moment-generating function of the random variable $v$ under hypothesis
$H_1$ is

$$h_v(z) = E(e^{-z(n + x)} \mid H_1) = h_n(e^{-z})\, h_x(z), \tag{12-37}$$

where $h_n(\cdot)$ is the probability-generating function of the number $n$ of photoelectrons.
Here

$$h_v(z) = \exp[n_s(e^{-z} - 1) + \tfrac{1}{2}\sigma^2 z^2]$$

by (12-13). The probability $Q_d$ of detection can be computed by saddlepoint integration
of (5-19) or (5-20), depending on whether the decision level $v_0$ lies below or
above the expected value $n_s$ of $v$. Alternatively, the series to be derived in Problem
12-6 can be utilized.

Figure 12-4. Detection probability $Q_d$ of light in the Poisson limit versus the expected
number $n_s$ of electrons ejected by the signal. The curves are indexed by
the expected number $n_0$ of electrons under hypothesis $H_0$ that the signal is absent.
$Q_0 = 10^{-6}$.
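For moderate values of $n_s$ the detection probability of the counter with additive Gaussian noise can also be checked by conditioning on the number of photoelectrons and summing. This is a direct sum written for illustration; it is neither the saddlepoint integration referred to above nor necessarily the series of Problem 12-6. It assumes that erfc in the expression for $Q_0$ denotes the Gaussian tail probability $\Pr(x/\sigma > y)$, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def gaussian_tail(y):
    """Pr(standard normal variable > y)."""
    return 0.5 * math.erfc(y / math.sqrt(2.0))

def decision_level(sigma, q0):
    """Level v0 with Pr(x > v0) = Q0 when n = 0 under H0 (Gaussian symmetry)."""
    return -sigma * NormalDist().inv_cdf(q0)

def detection_probability(ns, sigma, q0, k_max=None):
    """Q_d = sum over k of Poisson(k; n_s) * Pr(x > v0 - k)."""
    v0 = decision_level(sigma, q0)
    if k_max is None:
        k_max = int(ns + 12.0 * math.sqrt(ns) + 30.0)
    qd = 0.0
    pmf = math.exp(-ns)                    # Poisson probability of k = 0
    for k in range(k_max + 1):
        qd += pmf * gaussian_tail((v0 - k) / sigma)
        pmf *= ns / (k + 1)                # step to the probability of k + 1
    return qd

if __name__ == "__main__":
    for sigma in (0.1, 0.5, 1.0, 2.0):
        print(sigma, detection_probability(10.0, sigma, 1e-6))
```

When $\sigma$ is so small that $v_0 < 1$, the result approaches $1 - \exp(-n_s)$, the value for a noiseless counter.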

This noise can be overcome by amplifying the number of photoelectrons by
photomultiplication. In a typical photomultiplier, primary electrons ejected by the
incident light are accelerated by an applied voltage and strike an electrode, or
dynode, from which they eject secondary electrons. These are accelerated by a further
voltage drop and strike a second dynode, ejecting tertiary electrons. The process
may continue through some number $M$ of stages. The randomness of the number
of secondary electrons per primary at each dynode in effect adds noise to the signal,
but if the overall gain of the photomultiplier is large enough, the Gaussian noise
represented by $x$ can be overcome.

Figure 12-5. Detection probability $Q_d$ of light ejecting electrons with a Poisson
distribution in the absence of background light, but with the addition of Gaussian
noise in the external circuit, versus the expected number $n_s$ of electrons generated
by the signal; $Q_0 = 10^{-6}$. The curves are indexed with the standard deviation $\sigma$ of
the additive noise.



We consider first a single stage of multiplication. The number of secondary
electrons ejected by each primary electron is a random variable. Denote by $p_j^{(s)}$ the
probability that a single primary electron ejects $j$ secondaries from the dynode. The
number of primary electrons is also a random variable; let $\pi_r$ be the probability that
$r$ of them strike the first dynode during the observation interval. The total number
of output electrons is

$$n = \sum_{k=1}^{r} n_k,$$

where $n_k$ is the random number of secondary electrons ejected by the $k$th primary
electron; the number $r$ of terms in the sum is random. It is assumed that the
numbers $n_k$ of secondaries ejected by the several primary electrons are statistically
independent.

The probability distribution of the total number $n$ of output electrons from a
single stage is most easily calculated from its probability-generating function

$$h(z) = E(z^n) = \sum_{r=0}^{\infty} \pi_r E(z^n \mid r).$$

The conditional expected value $E(z^n \mid r)$ is equal to $[g(z)]^r$, where

$$g(z) = \sum_{j=0}^{\infty} p_j^{(s)} z^j \tag{12-38}$$

is the probability-generating function of the number of secondary electrons per primary.
Hence

$$h(z) = \sum_{r=0}^{\infty} \pi_r [g(z)]^r = f(g(z)), \tag{12-39}$$

where

$$f(z) = \sum_{r=0}^{\infty} \pi_r z^r \tag{12-40}$$

is the probability-generating function of the number of primary electrons ejected by
the incident light.
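The composition $h(z) = f(g(z))$ in (12-39) can be turned into output probabilities by expanding the powers of $g(z)$ as truncated polynomials. The sketch below does this by repeated convolution; it is a brute-force illustration suitable only for short distributions, and the function names are hypothetical.

```python
def convolve(a, b, n_max):
    """Coefficients up to z**n_max of the product of two polynomials."""
    out = [0.0] * (n_max + 1)
    for i, ai in enumerate(a):
        if ai == 0.0:
            continue
        for j, bj in enumerate(b):
            if i + j > n_max:
                break
            out[i + j] += ai * bj
    return out

def output_distribution(primary, secondary, n_max):
    """Coefficients of h(z) = f(g(z)) up to z**n_max.

    primary[r]   : probability pi_r of r primary electrons
    secondary[j] : probability of j secondaries per primary
    """
    power = [1.0] + [0.0] * n_max          # g(z)**0
    out = [0.0] * (n_max + 1)
    for pi_r in primary:
        for n in range(n_max + 1):
            out[n] += pi_r * power[n]      # add the term pi_r * g(z)**r
        power = convolve(power, secondary, n_max)
    return out

if __name__ == "__main__":
    # Exactly two primaries, each ejecting 0 or 1 secondary with equal probability.
    print(output_distribution([0.0, 0.0, 1.0], [0.5, 0.5], 4))
```

The cost grows with the product of the lengths involved, which is one reason the recurrence and saddlepoint methods discussed later in this section are preferred for large gains.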

If there are a number $M$ of dynodes instead of only one, we let $g_j(z)$ be
the probability-generating function of the number of secondary electrons ejected by
each electron that impinges on the $j$th dynode, $j = 1, 2, \ldots, M$. Let $\Gamma_j(z)$ be the
probability-generating function of the number of electrons leaving the $M$th (last)
dynode when a single electron strikes the $j$th one. Then by (12-39)

$$\Gamma_M(z) = g_M(z), \qquad \Gamma_j(z) = g_j(\Gamma_{j+1}(z)), \qquad h(z) = f(\Gamma_1(z)). \tag{12-41}$$

The gain $G_j$ at the $j$th dynode is the expected number of electrons ejected from it
per electron incident on it,

$$G_j = g_j'(1),$$

and the overall gain of the device, that is, the expected number of electrons at the
output for each primary photoelectron at the input, is

$$G_0 = \prod_{j=1}^{M} G_j. \tag{12-42}$$

Recurrent equations for the probabilities $p_n^{(j)}$ that $n$ electrons are ejected from
the last dynode when a single electron strikes the $j$th can be developed by means of
formulas given by Rice for differentiating functions of functions [Ric68, p. 1998]. In
(12-41)

$$\Gamma_j(z) = \sum_{n=0}^{\infty} p_n^{(j)} z^n,$$

and by (12-9) and Rice's procedure



1 j!1 

==o n\ dz" 



: = 



where 



d k 



(12-43) 

u -Pa 



and the array of coefficients b l ^ k is determined from the probabilities pf" by the 
recurrent relations 



In particular 



[ (y+i)]" 



II II 



«! 



These relations can be programmed for a computer. (The $b_{nk}$ are Rice's $c_{nk}$ divided
by $n!$.) Furthermore,

$$p_0^{(j)} = \Gamma_j(0) = g_j(p_0^{(j+1)}), \qquad p_0^{(M)} = g_M(0),$$

by (12-41) with $z = 0$. One starts with $j = M$ and works down to $j = 0$, taking
$g_0(z) = f(z)$.

If, for instance, the single-stage distribution is Poisson with gain $G$ at each
stage,

$$p_k^{(s)} = \frac{G^k}{k!} e^{-G}, \qquad g(z) = e^{G(z - 1)}, \tag{12-44}$$

and (12-43) becomes

$$\left.\frac{d^k g(u)}{du^k}\right|_{u = p_0^{(j+1)}} = G^k g(u) = G^k \exp[G(p_0^{(j+1)} - 1)].$$

Recurrent relations for this Poisson single-stage distribution were given by Lombard
and Martin [Lom61]. For the Polya distribution, or generalized hypergeometric
distribution, whose probability-generating function is

$$g(z) = [1 - bG(z - 1)]^{-1/b},$$

the recurrent relations were worked out by Prescott [Pre66].

If the gains or the number of stages or both are large, the expected number
of electrons emerging from the photomultiplier will be very large, and these
recurrence methods become lengthy and cumbersome, requiring great amounts of
computer memory to store the probabilities $p_n^{(j)}$ and the array of coefficients $b_{nk}$
from stage to stage. Round-off error corrupts the results, particularly in computing
the complementary cumulative distribution (12-12) for $k > E(n) = m_1 G_0$,
where $m_1$ is the expected number of primary electrons ejected by the incident light.
The method of saddlepoint integration has been found efficacious under these
circumstances [Hel84b]. Its application to a single stage of photomultiplication and a
treatment of recurrence methods are to be found in [Hel84a].

How a single stage of photomultiplication overcomes the noise in the external
circuit is illustrated in Fig. 12-6. Background light is absent; $n_0 = 0$. The number of
primary photoelectrons under hypothesis $H_1$ has a Poisson distribution with expected
value $n_s$, and as in (12-44) the number of secondaries per primary has a Poisson
distribution with expected value $G$, the gain. The number of electrons counted
at the output during $(0, T)$ then has a Neyman type A distribution [Tei81]. The
probability-generating function of the number $n$ of output electrons is

$$h_n(z) = \exp\{n_s[e^{G(z - 1)} - 1]\} \tag{12-45}$$

by (12-39), and the probabilities $p_k$ of counting $k$ output electrons obey Neyman's
recurrent relations

$$p_0 = \exp[-n_s(1 - e^{-G})],$$

$$p_{k+1} = \frac{n_s G e^{-G}}{k + 1} \sum_{r=0}^{k} \frac{G^r}{r!}\, p_{k-r}, \qquad k \ge 0,$$

[Ney39], [Hel91, p. 269].
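Neyman's recurrent relations translate directly into a short routine. The sketch below is a plain transcription for illustration (the function name is hypothetical); it costs on the order of $k_{\max}^2$ operations and so is suited to modest gains only.

```python
import math

def neyman_type_a(ns, gain, k_max):
    """Probabilities p_0 .. p_{k_max} of the Neyman type A distribution (12-45)."""
    p = [math.exp(-ns * (1.0 - math.exp(-gain)))]
    coeff = ns * gain * math.exp(-gain)
    weights = [1.0]                         # G**r / r!
    for r in range(1, k_max + 1):
        weights.append(weights[-1] * gain / r)
    for k in range(k_max):
        s = sum(weights[r] * p[k - r] for r in range(k + 1))
        p.append(coeff * s / (k + 1))
    return p

if __name__ == "__main__":
    p = neyman_type_a(2.0, 3.0, 80)
    mean = sum(k * pk for k, pk in enumerate(p))
    print(sum(p), mean)                     # the mean should be close to n_s * G
```

The computed mean can be compared with $n_s G$, which follows from differentiating (12-45) at $z = 1$.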

To the number $n$ of output electrons is added Gaussian random noise $x$ with
expected value 0 and variance $\sigma^2$ as in our example at the beginning of this section.
The receiver chooses hypothesis $H_1$ when the sum $n + x$ exceeds a decision level
$v_0$ set for a preassigned false-alarm probability $Q_0$; as before, $Q_0 = \mathrm{erfc}(v_0/\sigma)$. The
moment-generating function of the sum is determined as in (12-37). In Fig. 12-6 we
plot versus the gain $G$ the expected numbers $n_s$ of input photoelectrons required to
attain two values of the probability $Q_d$ of detection; $Q_0 = 10^{-6}$. The performance
of the simple photoelectric counter assumed in Fig. 12-5 is surpassed for gains only
a little larger than 1.

As the gain $G$ increases, the expected number $n_s$ approaches the value determined
by (12-34),

$$n_s \to -\ln(1 - Q_d).$$

When $G \gg 1$, a false dismissal occurs only when the number $n$ of output electrons
is zero, and

$$p_0 = \Pr(n = 0) \approx \exp(-n_s).$$

The rest of the distribution of $n$ is clustered around its expected value $G n_s \gg 1$.
For $1 \le k \le v_0$, $\Pr(n = k) \ll p_0$, and

$$Q_1 = 1 - Q_d \approx p_0 \Pr(x < v_0) = p_0(1 - Q_0) \approx p_0.$$

The number $n_s$ was determined by the secant method applied to the false-dismissal
probability

$$Q_1(n_s) = \Pr(n + x < v_0 \mid H_1),$$

which during the first part of the search was calculated by the saddlepoint approximation
(5-53). After that, $Q_1$ was computed by numerical integration of (5-19) along
a straight vertical path through the saddlepoint.

Figure 12-6. Expected number $n_s$ of input photoelectrons to attain a fixed probability
$Q_d$ of detection with a single stage of photomultiplication versus the gain
$G$ of that stage. To the output number Gaussian noise of variance $\sigma^2$ is added.
Solid curves: $Q_d = 0.9$; dashed curves: $Q_d = 0.999$; $Q_0 = 10^{-6}$. Curves are indexed
with the standard deviation $\sigma$ of the noise.

12.3.2 Avalanche Photodiodes 

A somewhat different type of photomultiplier is the avalanche photodiode. Light
striking it on one face causes the injection of photoelectrons into the body of the
device. Under an applied voltage these are accelerated to the point where they create
pairs of holes and electrons by collision with the atoms of the material. These new
electrons and holes are themselves accelerated and create further hole-electron pairs.
The number $n$ of electrons at the output face of the device during the observation
interval $(0, T)$ is counted, and if it exceeds a certain decision level, the receiver
decides that a light signal was present at the input.

The number $n$ of electrons produced at the output by a single photoelectron at
the input to the device is called the random avalanche gain. The distribution of this
number depends on the ratio $K$ between the probability that a hole creates a
hole-electron pair by collision and the probability that an electron does so; $0 < K < 1$.
Under the assumption that this ratio $K$ is independent of the energy of the hole or
the electron, Personick [Per71] showed that the probability-generating function $M(z)$
of the number $n$ of output electrons per incident primary electron is given by the
implicit equation

$$z = M[1 + a(M - 1)]^{-b}, \qquad b = \frac{1}{1 - K}, \tag{12-46}$$

$$M(z) = \sum_{n=1}^{\infty} \Pr(n \mid 1)\, z^n,$$

where the parameter $a$ is related to the gain $G$ of the photodiode by

$$G = \frac{1}{1 - ab}, \qquad 0 < a < 1 - K.$$

At the same time, McIntyre [McI72] had worked out the probabilities $\Pr(n \mid r)$ that
$n$ electrons appear at the output when $r$ primary electrons are injected at the input;
the probability-generating function of these is

$$[M(z)]^r = \sum_{n=r}^{\infty} \Pr(n \mid r)\, z^n.$$

Only later was it shown that McIntyre's probabilities indeed follow from Personick's
probability-generating function [Bal76]. When the probability-generating function
of the number of primary electrons is $f(z)$, as in (12-40), that of the total number
of output electrons counted in $(0, T)$ is

$$h(z) = \sum_{r=0}^{\infty} \pi_r [M(z)]^r = f(M(z)). \tag{12-47}$$

For Poisson input probabilities Conradi [Con72] calculated the output probabilities

$$p_n = \sum_{r=0}^{\infty} \pi_r \Pr(n \mid r).$$

Because of the complexity of the formulas for $\Pr(n \mid r)$, this computation and that of
the cumulative probabilities $q_k^{(-)}$ and $q_k^{(+)}$ can be cumbersome, particularly for large
numbers $n$ of output electrons.

Direct use of the probability-generating function $M(z)$, with (12-47), in numerical
integration of (12-11) and (12-12) for these cumulative probabilities is difficult
because of the necessity of solving (12-46) for $M(z)$ at each point on the contour
of integration. When calculating the cumulative distribution of the number of output
electrons, it has been found advantageous instead to use (12-46) to change the
variable of integration in the contour integral from $z$ to $M$. This transformation is
single-valued provided one cuts the $z$-plane along the positive Re $z$-axis to the right
of the point $x^*$ given by

$$x^* = M^*[1 + a(M^* - 1)]^{-b} > 1, \qquad M^* = \frac{1 - a}{a(b - 1)} > 1. \tag{12-48}$$

One can then transform the contours $C_-$ and $C_+$ of integration into parabolic contours
passing through saddlepoints $M_0^-$ and $M_0^+$ on the Re $M$-axis:

$$0 < M_0^- < 1, \qquad 1 < M_0^+ < M^*.$$

Details can be found in [Hel84c] and in [Hel88a]. This method has proved quite
efficient for calculating the cumulative output distributions even for large numbers
of electrons at the output of the photodiode.

In more complicated situations, as when Gaussian noise is added to the output
or when, as we shall see in Sec. 12.3.3, it is the output current, rather than the
number of output electrons, whose distribution is sought, what one needs is $M(e^{-z})$
as in (12-37). It is necessary to solve

$$-z = \ln M - b \ln[1 + a(M - 1)] \tag{12-49}$$

for $M$. In the neighborhood of the point $M = 1$, $z = 0$, one can use the power series

$$M(e^{-z}) = 1 + \sum_{n=1}^{\infty} c_n (-z)^n, \qquad |z| < u = \ln x^*,$$

whose coefficients are proportional to the moments of the random avalanche gain.
They can be computed by the recurrent relations [Hau78]



C ll + l 



G 



n + 1 



(a + l)o, + a£ Mi-r + a{b - 1) J> + l)c, +]£W 



c=0 



n > 1. 

The recurrences begin with

$$c_1 = G, \qquad c_2 = \tfrac{1}{2} G^3 (1 - a^2 b).$$

A series that is valid everywhere except on the cut in the $z$-plane extending
from $-z^*$ to $-\infty$ is

2M} 



M{e~ : ) - M* - £ e k t k+ \ t = [i\z + z,)\ ]/1 , V = 
■ whose coefficients are generated by the recurrence 



K ' 



e„ = 



1 



n + 2 



1 - K 

Mi 



a-] 



[5,-2 ~ (b + l)M« e|H ] - J> + \)e r e^ r 



it 

S„ = Y e > e ><->-< 



i=0 



starting with 



eo = 1, 



2 6+1 M* 



3 b - \ r 

[Hel92c]. Far from the branch point $-z^*$ the number of terms one needs to include
in this series becomes excessive, and one must resort to Newton's method in order
to solve (12-49).
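A minimal sketch of Newton's method applied to (12-49) is given below. It is restricted, as an assumption made for simplicity, to real $z \ge 0$, where the root lies in $0 < M \le 1$; on the complex contours of integration the same iteration would have to be carried out in complex arithmetic with attention to the branch cut. The function name, the starting guess $M = e^{-z}$, and the step-halving safeguard are choices made here, not taken from the text.

```python
import math

def avalanche_mgf(z, a, K, tol=1e-13, max_iter=100):
    """Solve -z = ln M - b ln[1 + a(M - 1)], b = 1/(1 - K), for M = M(e^{-z}).

    Valid as written for real z >= 0 and 0 < a < 1 - K.
    """
    b = 1.0 / (1.0 - K)
    M = math.exp(-z)                        # starting guess: unit gain
    for _ in range(max_iter):
        denom = 1.0 + a * (M - 1.0)
        F = math.log(M) - b * math.log(denom) + z
        dF = 1.0 / M - a * b / denom        # derivative of F with respect to M
        M_new = M - F / dF
        if M_new <= 0.0:                    # safeguard: stay inside M > 0
            M_new = 0.5 * M
        if abs(M_new - M) <= tol * M_new:
            return M_new
        M = M_new
    raise RuntimeError("Newton iteration did not converge")

if __name__ == "__main__":
    a, K = 0.9, 0.02
    G = 1.0 / (1.0 - a / (1.0 - K))
    z = 1e-6
    print((1.0 - avalanche_mgf(z, a, K)) / z, G)   # slope at z = 0 approximates G
```

For $K = 0$ (so $b = 1$) equation (12-46) can be solved in closed form, $M = (1 - a)y/(1 - ay)$ with $y = e^{-z}$, which provides a check on the iteration.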

12.3.3 Shot Noise 

The numbers of electrons at the output of a photodetector are seldom counted
directly. In most detectors they induce a current $j(t)$ in an external circuit, and
that current, possibly after amplification, is what is measured. In binary detection,
hypothesis $H_1$ that a signal is present is selected when the value of the current
exceeds a certain decision level. We turn, therefore, to an introduction to how one
can compute the distribution of the current at the output of a photodetector.
The output current has the form

$$j(t) = \sum_{m=-\infty}^{\infty} f(t - t_m), \tag{12-50}$$

where $t_m$ is the time at which the $m$th photoelectron crosses from cathode to anode,
and $f(t)$ is the current pulse it induces in the external circuit. The shape and width
of $f(t)$ depend on the effective resistance, inductance, and capacitance of that circuit.
We shall normalize the output pulses so that

$$\int_{-\infty}^{\infty} f(t)\, dt = 1.$$

The dimension of the current $j(t)$ is then electrons per second. By multiplying by
the electronic charge $q = 1.6 \cdot 10^{-19}$ coulombs, one obtains the current in amperes.

The times $t_m$ form a Poisson point process with rate $\lambda(t)$ given by (12-3). The
probability that one electron is emitted during a brief interval $\delta t$ is $\lambda\,\delta t$, and the
numbers of emissions in disjoint intervals are statistically independent. We seek the
conditional moment-generating function

$$h(z) = E \exp[-z j(t)] \tag{12-51}$$

of the current $j(t)$ at time $t$, as in (12-50). The expectation $E$ is taken with respect
to the random times $t_m$, and the rate $\lambda(t)$ is assumed given.

To this end we divide the time into intervals $\delta t$ so brief that the probability that
more than one electron is emitted during any one interval is negligible. We define
the random variable $\varepsilon_m$ as the number of electrons emitted during the $m$th interval.
Then

$$\Pr(\varepsilon_m = 0) = 1 - \lambda(t_m)\,\delta t, \qquad \Pr(\varepsilon_m = 1) = \lambda(t_m)\,\delta t,$$

with $t_m = m\,\delta t$; the $\varepsilon_m$'s are statistically independent.
We approximate the current by

$$\tilde{j}(t) = \sum_m \varepsilon_m f(t - t_m), \tag{12-52}$$

as though the electrons were emitted, if at all, only at the beginning of the intervals.
As $\delta t$ decreases, $\tilde{j}(t)$ approaches the true current $j(t)$. Conditioned on a given
emission rate $\lambda(t)$, its moment-generating function is

$$\begin{aligned}
h(z) &= E \exp[-z\tilde{j}(t)] \\
&= E \exp\Big[-z \sum_m \varepsilon_m f(t - t_m)\Big] \\
&= \prod_m E \exp[-z\,\varepsilon_m f(t - t_m)] \\
&= \prod_m \{1 - \lambda(t_m)\,\delta t + \lambda(t_m)\,\delta t \exp[-z f(t - t_m)]\} \\
&\approx \prod_m \exp\big[\lambda(t_m)\,\delta t\,\{\exp[-z f(t - t_m)] - 1\}\big] \\
&= \exp \sum_m \lambda(m\,\delta t)\{\exp[-z f(t - m\,\delta t)] - 1\}\,\delta t.
\end{aligned} \tag{12-53}$$

In the fifth line we used the approximation $1 + x \approx e^x$ for $x \ll 1$ in order to
introduce the exponential function. When we pass to the limit $\delta t \to 0$, this
moment-generating function becomes

$$h(z) = \exp\left\{\int_{-\infty}^{\infty} \lambda(\tau)\{\exp[-z f(t - \tau)] - 1\}\, d\tau\right\}. \tag{12-54}$$

Inversion of this moment-generating function to obtain the probability distribution
of the current $j(t)$ is a difficult problem when the expected number

$$\bar{n} = \int \lambda(t)\, dt$$

of electrons is small; see [Yue78]. For $\bar{n}$ greater than about 10, saddlepoint integration
as in (5-19) and (5-20) becomes feasible. In most cases the integral in the exponent
of (12-54) must be evaluated numerically at each point on the contour of integration.
The path of steepest descent of the integrand has an infinite number of hairpinlike
branches, opening to the right and going off to infinity, one above the other. Each
passes through a saddlepoint of the integrand. Only when $\bar{n} \gg 1$ does it suffice to
calculate the contributions only of the main branch cutting the Re $z$-axis and one
or two adjacent branches. Computations seem to have been limited to shot noise
for which the rate $\lambda(t)$ is constant and for output pulses $f(t)$ of simple shapes such
as a triangle or a half-cycle of a sine wave, restricted to a finite interval outside of
which they vanish.



When $f(t)$ is a rectangle,

$$f(t) = \begin{cases} 1/T, & 0 < t < T, \\ 0, & t < 0, \; t > T, \end{cases}$$

$$h(z) = \exp\left\{\int_{t-T}^{t} \lambda(\tau)[\exp(-z/T) - 1]\, d\tau\right\} = \tilde{h}(e^{-z/T}),$$

where $\tilde{h}(\cdot)$ is the probability-generating function of the number $n$ of electrons
counted during the past $T$ seconds, as defined in (12-14).
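The exponent of (12-54) usually has to be integrated numerically. The following sketch does so with a midpoint rule for real $z$ and checks the rectangular-pulse case against the closed form $\exp[\lambda T(e^{-z/T} - 1)]$ that holds for a constant rate. The function name, the finite integration limits `lo` and `hi` (which must cover the support of the pulse), and the step count are illustrative assumptions; a contour integration would need complex $z$.

```python
import math

def shot_noise_mgf(z, rate, pulse, t, lo, hi, steps=20000):
    """h(z) = exp( integral over tau of rate(tau) * (exp(-z*pulse(t - tau)) - 1) ).

    rate, pulse : callables giving lambda(tau) and the pulse shape f(.)
    lo, hi      : integration limits in tau covering the support of pulse(t - tau)
    """
    d = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        tau = lo + (i + 0.5) * d            # midpoint of the i-th subinterval
        total += rate(tau) * (math.exp(-z * pulse(t - tau)) - 1.0)
    return math.exp(total * d)

if __name__ == "__main__":
    T, lam, z = 2.0, 5.0, 0.7
    rect = lambda s: 1.0 / T if 0.0 < s < T else 0.0
    numeric = shot_noise_mgf(z, lambda tau: lam, rect, t=0.0, lo=-T, hi=0.0)
    closed = math.exp(lam * T * (math.exp(-z / T) - 1.0))
    print(numeric, closed)
```

For small $z$ the quantity $[1 - h(z)]/z$ approaches the mean current $\lambda \int f(t)\,dt = \lambda$, which gives a second quick check.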

When the rate $\lambda(t)$ is itself a random process, as in (12-3) with $V(t)$ a
circular-complex Gaussian random process, one must average (12-54) with respect to the
distributions of $\lambda(t)$. The moment-generating function $h(z)$ then has a form similar
to that of the rectified output of a quadratic detector, (11-66), and to calculate it
the methods described in Sec. 11.2.1 and in [Hel86b] can be tried. Extensive numerical
computation will in any case be required in order to evaluate the probability
distribution of the output current $j(t)$.

When the current $j(t)$ results from electrons generated in a photomultiplier, we
replace (12-52) by

$$\tilde{j}(t) = \sum_m \varepsilon_m n_m f(t - t_m), \tag{12-55}$$

where $n_m$ is the random number of electrons produced at the output of the device
when a single primary electron is injected at its input. It is assumed that these $n_m$
output electrons are produced in a burst that is much shorter than the duration of
the output current pulse $f(\cdot)$. The numbers $n_m$ generated by each primary electron
are furthermore assumed to be statistically independent. This requires the incident
light to be so weak that the secondary electrons hardly perturb the electromagnetic
fields within the device and do not inhibit the generation or passage of subsequent
secondary electrons.

When the approximate current $j(t)$ is given by (12-55), we replace $e_m$ by $e_m n_m$
in (12-53), and the fourth line of that equation becomes

$$ h(z) = \prod_m E\{1 - \lambda(t_m)\,\delta t + \lambda(t_m)\,\delta t\,\exp[-n_m z f(t - t_m)]\}, $$

in which the expected value is taken with respect to the distribution of the random
numbers $n_m$ of secondaries per primary electron. This averaging results in

$$ h(z) = \prod_m \{1 - \lambda(t_m)\,\delta t + \lambda(t_m)\,\delta t\,\tilde{M}(z f(t - t_m))\}, \qquad \tilde{M}(z) = M(e^{-z}), $$

where $\tilde{M}(z)$ is the moment-generating function and $M(z)$ is the probability-
generating function of the number of secondaries per primary. For a single-stage
photomultiplier, $M(z)$ equals $g(z)$ in (12-38); for multiple stages it equals $Y_M(z)$ in
(12-41); and for an avalanche photodiode it is given by (12-46). Continuing as in
(12-53), we find in the limit $\delta t \to 0$ that the conditional moment-generating function
of the output current is

$$ h(z) = E[\exp(-z j(t))] = \exp\left\{\int_{-\infty}^{\infty} \lambda(\tau)\,[\tilde{M}(z f(t - \tau)) - 1]\,d\tau\right\}. \tag{12-56} $$

Again, if $\lambda(t)$ is random, as in (12-3), this moment-generating function must be
averaged with respect to the distributions of $\lambda(\tau)$.

When subsequent amplification in effect adds Gaussian noise $n_G(t)$ with expected
value zero and equivalent variance $\sigma^2$ to the current $j(t)$, the moment-
generating function of the sum $j(t) + n_G(t)$ will be

$$ h_t(z) = \exp\left\{\int_{-\infty}^{\infty} \lambda(\tau)\,[\tilde{M}(z f(t - \tau)) - 1]\,d\tau + \tfrac{1}{2}\sigma^2 z^2\right\}. $$
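The exponent of this moment-generating function seldom has a closed form, so in practice it is evaluated numerically. The following is a minimal sketch of such an evaluation by the trapezoidal rule. The function names, the constant rate, the exponential pulse, and the Poisson probability-generating function with gain `GAIN` are illustrative assumptions, not choices made in the text; the integration limits must be chosen by the user to cover the support of the integrand.

```python
import cmath
import math


def log_mgf_output_current(z, t, rate, pulse, pgf, tau_min, tau_max,
                           sigma=0.0, steps=2000):
    """Exponent ln h(z) of E[exp(-z (j(t) + n_G(t)))].

    Evaluates  integral of rate(tau) * [pgf(exp(-z * pulse(t - tau))) - 1] d tau
    over (tau_min, tau_max) by the trapezoidal rule and adds the
    Gaussian-noise term sigma^2 z^2 / 2.  z may be real or complex.
    """
    d_tau = (tau_max - tau_min) / steps
    total = 0.0
    for k in range(steps + 1):
        tau = tau_min + k * d_tau
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoidal end weights
        mgf_secondaries = pgf(cmath.exp(-z * pulse(t - tau)))
        total += weight * rate(tau) * (mgf_secondaries - 1.0)
    return total * d_tau + 0.5 * sigma ** 2 * z ** 2


# Hypothetical model used only to exercise the routine.
GAIN = 3.0  # assumed mean number of secondaries per primary


def example_rate(tau):
    """Assumed constant emission rate of 5 primaries per unit time on [0, 2]."""
    return 5.0 if 0.0 <= tau <= 2.0 else 0.0


def example_pulse(t):
    """Assumed causal exponential current pulse with unit area."""
    return math.exp(-t) if t >= 0.0 else 0.0


def example_pgf(s):
    """Assumed Poisson probability-generating function with mean GAIN."""
    return cmath.exp(GAIN * (s - 1.0))
```

Because the exponent is the cumulant-generating function in the Laplace variable, its first two derivatives at z = 0 give minus the mean and the variance of the output, which provides a convenient check on any implementation before it is used along a contour of integration.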

The values of this moment-generating function along a path of integration must 
usually be computed by numerical integration of the exponent in (12-56). When 
calculating error probabilities of optical receivers by saddlepoint integration, it is 
important to sketch the path of steepest descent through each saddlepoint in a few 
cases in order to determine a suitable approximation to it into which to deform the
contour of integration in (5-19) and (5-20). An application to receivers embodying
avalanche diodes in a fiber-optical communication system is to be found in [Hel92c].

Figure 12-7. Heterodyne receiver.

12.4 OPTICAL HETERODYNE AND HOMODYNE DETECTION 
12.4.1 The Heterodyne Receiver 

In an optical heterodyne receiver the incoming light, carrying the signal to be de- 
tected, is combined at a half-silvered mirror with a strong beam of coherent light from 
a laser, which might be termed the local oscillator (L.O.). As shown in Fig. 12-7, the
sum of the two beams falls on a photoelectric cell, whose emissive surface acts much 
like a quadratic rectifier. In the interaction of the incident beam with the surface, 
shot noise is generated as a result of the random emission of photoelectrons. We 
shall study the effect of this noise on the detectability of the signal. 

The field of the local oscillator beam in the heterodyne receiver must precisely 
match that of the signal over the surface of the detector in both spatial phase and 
polarization; otherwise the effective signal-to-noise ratio will be diminished. This 
matching is most easily achieved at infrared frequencies, where the wavelength of 
the light is conveniently long. Some signal energy will be lost at the mirror. The 
more transparent this is made for the signal, the more transparent it will be for the
local-oscillator beam, and the more powerful this beam must be in order for our 
subsequent approximations to be valid. If the beam is too strong, however, it may 
damage the mirror. The technical details of constructing a heterodyne receiver are 
treated in the literature [Kin78].

Each photoelectron creates a pulse of current in the external circuit of the
detector, shown in Fig. 12-7. The form $f(t)$ of this pulse depends on the resistance
and the stray inductance and capacitance of the circuit. The total current in the
circuit is, as in (12-50),

$$ j(t) = \sum_i f(t - \tau_i), $$

in which $\tau_i$ is the time at which the $i$th electron was emitted. As before, the times $\tau_i$
form a Poisson point process with rate

$$ \lambda(t) = \tfrac{1}{2}\eta\,|V(t)|^2, \tag{12-57} $$

where $V(t)$ is the complex envelope of the incident light field and $\eta = \eta'/hf$, with
$\eta'$ the quantum efficiency of the detector and $hf$ the energy of a single quantum of
the light.

The complex envelope of the sum of the fields of the local-oscillator beam and
of the incoming light, labeled "Signal" in Fig. 12-7, can be written

$$ V(t) = L_0 \exp(-i\omega_0 t) + V_i(t), \tag{12-58} $$

where $L_0$ is the constant complex amplitude of the beam coming from the local
oscillator,

$$ \omega_0 = \Omega - \Omega_{\mathrm{L.O.}} $$

is the offset frequency, and $V_i(t)$ is the complex envelope of the incoming light. This
light has been filtered to remove components of the background light outside the
frequency band of the signal being sought. We assume that the local-oscillator beam
is so strong that $|L_0| \gg |V_i(t)|$.

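We first determine the joint moment-generating function of scaled samples
$j(t_m)/|L_0|$ of the current $j(t)$ at times $t_1, t_2, \ldots, t_m, \ldots$:

$$ h(z_1, z_2, \ldots, z_m, \ldots) = E\left\{\exp\left[-\frac{1}{|L_0|}\sum_m z_m j(t_m)\right]\right\} = E\left\{\exp\left[-\frac{1}{|L_0|}\sum_m \sum_i z_m f(t_m - \tau_i)\right]\right\}. $$

If we compare this with what we obtained when we substituted (12-50) into (12-51),
we see that our joint moment-generating function can be written down from $h(z)$ as
derived there by replacing $z f(t - \tau)$ everywhere by

$$ \frac{1}{|L_0|}\sum_m z_m f(t_m - \tau). $$

Thus from (12-54),

$$ h(z_1, z_2, \ldots, z_m, \ldots) = \exp\left\{\int_{-\infty}^{\infty} \lambda(\tau)\left[\exp\left(-\frac{1}{|L_0|}\sum_m z_m f(t_m - \tau)\right) - 1\right] d\tau\right\} $$

$$ = \exp\left\{\tfrac{1}{2}\eta\int_{-\infty}^{\infty} |V(\tau)|^2\left[\exp\left(-\frac{1}{|L_0|}\sum_m z_m f(t_m - \tau)\right) - 1\right] d\tau\right\} $$

by (12-57). This joint moment-generating function is conditioned on a given input
process $\mathrm{Re}\,V(t)\exp i\Omega t$.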

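Under the assumption that $|L_0|$ is large, we make a power-series expansion of
the innermost exponential in the moment-generating function, obtaining

$$ h(z_1, z_2, \ldots, z_m, \ldots) = \exp\Bigl\{-\frac{\eta}{2|L_0|}\int_{-\infty}^{\infty} |V(t)|^2 \sum_m z_m f(t_m - t)\,dt $$

$$ \qquad + \frac{\eta}{4|L_0|^2}\int_{-\infty}^{\infty} |V(t)|^2 \sum_m\sum_n z_m z_n f(t_m - t) f(t_n - t)\,dt \tag{12-59} $$

$$ \qquad - \frac{\eta}{12|L_0|^3}\int_{-\infty}^{\infty} |V(t)|^2 \sum_m\sum_n\sum_p z_m z_n z_p f(t_m - t) f(t_n - t) f(t_p - t)\,dt + \cdots\Bigr\}. $$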

The leading term in $|V(t)|^2$, by (12-58), is $|L_0|^2$, and we see that the term in
(12-59) with the triple summation is therefore proportional to $|L_0|^{-1}$. In the limit
$|L_0| \to \infty$ this term and all those beyond it will vanish. Because only the terms
linear and quadratic in the Laplace variables $z_m$ remain, the current $j(t)$ in this limit
is a Gaussian stochastic process. We can therefore use the results of Secs. 3.3 and
3.4 to determine the optimum processing of the output of the photodetector and the
resultant false-alarm and detection probabilities.
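The power counting behind this limit can be written out explicitly. The following is a sketch in the notation of (12-59); the symbol $\kappa_k$ for the term of order $k$ in the Laplace variables is introduced here only for illustration.

```latex
\kappa_k = \frac{(-1)^k\,\eta}{2\,k!\,|L_0|^{k}}
  \int_{-\infty}^{\infty} |V(t)|^2
  \Bigl[\sum_m z_m f(t_m - t)\Bigr]^{k} dt ,
\qquad
|V(t)|^2 \approx |L_0|^2
\;\Longrightarrow\;
\kappa_k \propto |L_0|^{2-k}.
```

The linear term ($k = 1$) is thus dominated by a constant bias growing as $|L_0|$, the quadratic term ($k = 2$) remains finite, and every term with $k \ge 3$ vanishes as $|L_0| \to \infty$, leaving the exponent of a Gaussian moment-generating function for the scaled current.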

From (12-59) the expected value of the current is

$$ E[j(t) \mid V(t'), 0 < t' < T] = \tfrac{1}{2}\eta\int_{-\infty}^{\infty} |V(s)|^2 f(t - s)\,ds, \tag{12-60} $$

conditionally on the complex envelope $V(t)$ of the light field incident on the photo-
cell, and its conditional autocovariance function is

$$ \mathrm{Cov}[j(t_1), j(t_2) \mid V(t'), 0 < t' < T] = \tfrac{1}{2}\eta\int_{-\infty}^{\infty} |V(s)|^2 f(t_1 - s) f(t_2 - s)\,ds, \tag{12-61} $$

which represents Campbell's theorem.
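Campbell's theorem can be checked by direct simulation. The sketch below assumes a constant emission rate (that is, constant $\tfrac{1}{2}\eta|V(s)|^2$) and a causal exponential current pulse of unit area with time constant `tau_c`; both choices, and all function names, are illustrative assumptions rather than anything specified in the text.

```python
import math
import random


def simulate_shot_noise(rate, tau_c, t_obs, trials, seed=1):
    """Sample mean and variance of j(t_obs) = sum_i f(t_obs - tau_i).

    Emission times tau_i form a homogeneous Poisson process of the given
    rate on (0, t_obs); f(t) = exp(-t / tau_c) / tau_c for t >= 0.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        current = 0.0
        tau = rng.expovariate(rate)          # first emission time
        while tau < t_obs:
            current += math.exp(-(t_obs - tau) / tau_c) / tau_c
            tau += rng.expovariate(rate)     # exponential gaps between emissions
        samples.append(current)
    mean = sum(samples) / trials
    variance = sum((x - mean) ** 2 for x in samples) / (trials - 1)
    return mean, variance


def campbell_prediction(rate, tau_c, t_obs):
    """Mean rate * int f and variance rate * int f^2 over (0, t_obs)."""
    mean = rate * (1.0 - math.exp(-t_obs / tau_c))
    variance = rate * (1.0 - math.exp(-2.0 * t_obs / tau_c)) / (2.0 * tau_c)
    return mean, variance
```

With, say, `rate=20.0`, `tau_c=1.0`, `t_obs=10.0`, and a few thousand trials, the sample mean and variance should agree with the predictions to within their statistical scatter.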

We now assume that the bandwidth of the detection circuit is so much greater
than that of the fluctuations of the complex envelope $V(t)$ of the light field that
we can replace $f(t)$ by a delta function in (12-60). Then the expected value of the
current is

$$ E[j(t) \mid V(t'), 0 < t' < T] = \tfrac{1}{2}\eta\,|V(t)|^2. \tag{12-62} $$

Because $|L_0| \gg |V_i(t)|$, the dominant term in (12-61) is

$$ \mathrm{Cov}\{j(t_1), j(t_2)\} = \tfrac{1}{2}\eta|L_0|^2\int_{-\infty}^{\infty} f(t_1 - s) f(t_2 - s)\,ds = \tfrac{1}{2}\eta|L_0|^2\int_{-\infty}^{\infty} |F(\omega)|^2 \exp[i\omega(t_1 - t_2)]\,\frac{d\omega}{2\pi}, $$

where

$$ F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-i\omega t}\,dt $$

is the Fourier transform of the current pulse $f(t)$ and is proportional to the transfer
function of the external circuit. The spectral density of the current fluctuations is
therefore

$$ \Phi_{sh}(\omega) = \tfrac{1}{2}\eta|L_0|^2|F(\omega)|^2. $$

Because the external circuit is assumed to have a broad passband, $|F(\omega)| \approx
|F(0)| = 1$, and the unilateral spectral density of the shot-noise component of the
current $j(t)$ is

$$ N_{sh} = 2\Phi_{sh}(0) = \eta|L_0|^2. \tag{12-63} $$

The shot noise is effectively white.

With $|L_0| \gg |V_i(t)|$, the expected value of the current is, from (12-58) and
(12-62),

$$ E[j(t) \mid V_i(t'), 0 < t' < T] = \tfrac{1}{2}\eta|L_0|^2 + \eta\,\mathrm{Re}[L_0^* V_i(t)\exp i\omega_0 t]. \tag{12-64} $$

The first term is biased out or absorbed into the decision level. In (12-58)

$$ V_i(t) = S(t) + V_b(t), $$

where $S(t)$ is the complex envelope of a coherent signal to be detected, and $V_b(t)$ is
the complex envelope of the background light field. Then the second term in (12-64)
represents the combination of a coherent narrowband signal at the intermediate
frequency $\omega_0$ with complex envelope

$$ S_{if}(t) = \eta L_0^* S(t) \tag{12-65} $$

and narrowband Gaussian noise $\mathrm{Re}\,W(t)\exp i\omega_0 t$ with complex envelope

$$ W(t) = \eta L_0^* V_b(t). $$

The complex autocovariance function of this noise is

$$ \tfrac{1}{2}E[W(t_1)W^*(t_2)] = \eta^2|L_0|^2 P\phi(t_1 - t_2) $$

by (12-1). This corresponds to narrowband noise with a narrowband spectral density

$$ \Psi(\omega) = \eta^2|L_0|^2 P\,\Phi(\omega) $$

by (3-20) and (12-2), where $\Phi(\omega)$ is the normalized spectral density of the background
light. If as in (12-26) that light is assumed to have a uniform spectral density
$\Phi(\omega) = W^{-1}$ over a band of angular frequencies, $-\pi W < \omega < \pi W$, much broader
than that occupied by the signal, the background contributes an additional white
noise having a unilateral spectral density

$$ N_b = \eta^2|L_0|^2\,\frac{P}{W}. \tag{12-66} $$

The signal $\mathrm{Re}\,S_{if}(t)\exp i\omega_0 t$ is thus to be detected in the presence of white
Gaussian noise of unilateral spectral density $N_{sh} + N_b$. As its phase is in general
unknown, we are confronted with the same detection problem as that treated in
Sec. 3.3. The voltage developed across the resistor in the external circuit of the
photodetector will be applied to a filter matched to that signal, and its output will
be passed to a rectifier, whose output at time $t = T$ will be compared with a decision
level set to attain a preassigned false-alarm probability $Q_0$.

The probability of detecting the signal is, as in (3-75),

$$ Q_d = Q(D, b) \tag{12-67} $$

in terms of Marcum's Q function, where $b$ is determined by the false-alarm proba-
bility

$$ Q_0 = e^{-b^2/2}, $$

and the effective signal-to-noise ratio $D^2$ is twice the energy of the signal $S_{if}(t)$
divided by the unilateral spectral density $N_{sh} + N_b$ of the noise,

$$ D^2 = \frac{1}{N_{sh} + N_b}\int_{-\infty}^{\infty} |S_{if}(t)|^2\,dt. $$

Here

$$ \int_{-\infty}^{\infty} |S_{if}(t)|^2\,dt = \eta^2|L_0|^2\int_{-\infty}^{\infty} |S(t)|^2\,dt = 2\eta|L_0|^2 n_s, $$

by (12-65) and (12-5), where $n_s$ is the average number of photoelectrons that would
be ejected by the coherent component of the incident light in the absence of the local
oscillator. By (12-63) and (12-66), furthermore,

$$ N_{sh} + N_b = \eta|L_0|^2 + \eta^2|L_0|^2\,\frac{P}{W} = \eta|L_0|^2\left[1 + \frac{n_0}{WT}\right], $$

with $n_0 = \eta PT$ as in (12-4). Thus we can write the effective signal-to-noise ratio as

$$ D^2 = \frac{2n_s}{1 + \bar{N}}, \qquad \bar{N} = \frac{n_0}{WT}. \tag{12-68} $$

The number $\bar{N}$ is the expected number of background photoelectrons per temporal
mode, and it is given by $\eta'\mathcal{N}$, where $\mathcal{N}$ is specified by the Planck law
(12-31) and $\eta'$ is the quantum efficiency.

In Fig. 12-3 we plotted as a dashed line (H) the probability $Q_d$ of detecting
a coherent narrowband light signal by the heterodyne receiver, as given by (12-67),
for a false-alarm probability $Q_0 = 10^{-6}$. We took $\bar{N} = 0.1$, and the abscissa is the
effective signal-to-noise ratio $D$ defined by (12-68). The solid lines represent the
probability of detecting the same signal by counting the number of photoelectrons it
ejects when the local-oscillator beam is turned off; they were calculated as described
in Sec. 12.2.2. The curves are indexed by the effective number $M = WT$ of temporal
degrees of freedom. Only if the background light is admitted over a broad range of
frequencies is a heterodyne receiver superior to one merely counting the number of
photoelectrons directly ejected by the light.

Figure 12-8 shows the gain achieved by using a heterodyne detector rather than
a photoelectric counter; the gain is defined as the ratio of the expected number $n_s$
of photoelectrons needed by the photoelectric counter to attain a given probability
$Q_d$ of detection to the expected number $n_s^{(H)}$ needed by the heterodyne receiver, for
the same false-alarm probability $Q_0$. Here, by (12-68),

$$ n_s^{(H)} = \tfrac{1}{2}D^2(1 + \bar{N}), $$

where $D^2$ is the signal-to-noise ratio specified by (12-67). The gain is plotted against
the time-bandwidth product $M = WT$ for four values of the expected number $\bar{N}$ of
background photoelectrons per temporal mode. The smaller the expected number
$\bar{N}$ is, the greater must the time-bandwidth product $WT$ be before it is advantageous
to use a heterodyne detector.

Figure 12-8. Gain $n_s/n_s^{(H)}$ (dB) of input signal strength when using a heterodyne
detector rather than a photoelectric counter versus the time-bandwidth product
$M = WT$; $Q_d = 0.99$, $Q_0 = 10^{-6}$. Curves are indexed with the expected number
$\bar{N}$ of photoelectrons per temporal mode.
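The heterodyne side of this comparison is easy to reproduce numerically. The sketch below evaluates $Q_d = Q(D, b)$ with $D^2 = 2n_s/(1 + \bar{N})$ and $Q_0 = e^{-b^2/2}$, and then inverts the relation for $n_s^{(H)}$ by the secant method. Marcum's Q function is computed from its standard representation as a Poisson-weighted sum of Poisson cumulative probabilities; that series, the function names, the starting points of the iteration, and the use of $\ln(1 - Q_d)$ as the quantity driven to zero are choices made here for illustration, not the procedure used for the figure.

```python
import math


def marcum_q(a, b, terms=400):
    """Marcum's Q(a, b) = sum_k Pois(k; a^2/2) * Pr[Pois(b^2/2) <= k]."""
    x, y = 0.5 * a * a, 0.5 * b * b
    w = math.exp(-x)      # Poisson weight exp(-x) x^k / k! for k = 0
    u = math.exp(-y)      # exp(-y) y^k / k! for k = 0
    tail = u              # running sum of u over j <= k
    total = w * tail
    for k in range(1, terms):
        w *= x / k
        u *= y / k
        tail += u
        total += w * tail
    return total


def heterodyne_detection_probability(n_s, n_bar, q0):
    """Q_d of (12-67) with D^2 = 2 n_s / (1 + n_bar) and Q_0 = exp(-b^2/2)."""
    b = math.sqrt(-2.0 * math.log(q0))
    d = math.sqrt(2.0 * n_s / (1.0 + n_bar))
    return marcum_q(d, b)


def secant(func, x0, x1, tol=1e-10, max_iter=60):
    """Plain secant iteration for a root of func."""
    f0, f1 = func(x0), func(x1)
    for _ in range(max_iter):
        if f1 == f0:
            break
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, f0, x1 = x1, f1, x2
        f1 = func(x1)
        if abs(x1 - x0) < tol:
            break
    return x1


def required_signal_heterodyne(q_d, q0, n_bar):
    """Expected signal count n_s^(H) = D^2 (1 + n_bar) / 2 attaining Q_d."""
    b = math.sqrt(-2.0 * math.log(q0))
    target = math.log(1.0 - q_d)

    def miss(d):
        # Work with ln(1 - Q) so the function is not flat near Q = 1.
        return math.log(max(1.0 - marcum_q(d, b), 1e-300)) - target

    d = secant(miss, b, b + 4.0)
    return 0.5 * d * d * (1.0 + n_bar)
```

For example, `required_signal_heterodyne(0.99, 1e-6, 0.1)` returns the expected number of signal photoelectrons a heterodyne receiver needs under the conditions of Fig. 12-8; the corresponding number for the photoelectric counter requires the counting distributions of Sec. 12.2 and is not attempted here.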

In plotting the curves in Fig. 12-8, we began by determining by (1-57) the
combination of the decision level $n_0$ on the number $n$ of photoelectrons and the
randomization fraction $f$ that yields the preassigned false-alarm probability $Q_0$ for
the hypergeometric distribution in (12-30). The number $n_s$ required to attain the
probability $Q_d$ of detection with the photoelectric counter was first calculated by the
saddlepoint approximation to (12-11), in which the probability-generating function
$h(z)$ is given by (12-27). By eliminating the parameter $n_s$ from the equations for
the "phase" and its first two derivatives, it was possible to apply the secant method
to search for the saddlepoint $z_0$ yielding the desired value of $Q_1 = 1 - Q_d$ and
thence to calculate an approximate value of $n_s$. With this as a starting value, the
secant method was then used with the exact Laguerre distribution (12-29) in (1-58)
to compute the expected number $n_s$ yielding the prescribed detection probability $Q_d$.

In this analysis we have assumed the ideal photoelectric detector shown in 
Fig. 12-7. In a practical heterodyne receiver it may be necessary to use photosensitive 
solid-state devices such as avalanche photodiodes. These will contribute additional 
noise that must be taken into account in assessing the performance of the receiver.

12.4.2 The Homodyne Receiver

If we know the frequency and the phase of the signal precisely (and this knowledge
is much more difficult to acquire at optical than at radio frequencies), we can set
the local oscillator beam at the same frequency and phase. The receiver is then
termed homodyne. When $L_0$ and the complex envelope $S(t)$ of the signal are real,
the conditional expected value of the current from the detector will be

$$ E[j(t)] = \tfrac{1}{2}\eta|L_0|^2 + \eta L_0[S(t) + \mathrm{Re}\,V_b(t)] $$

from (12-64), in which the signal component is $s(t) = \eta L_0 S(t)$. Here
$n(t) = \eta L_0\,\mathrm{Re}\,V_b(t) = \eta L_0 V_{bx}(t)$ is Gaussian noise with autocovariance function

$$ \phi_b(t_1 - t_2) = E[n(t_1)n(t_2)] = \eta^2 L_0^2\,E[V_{bx}(t_1)V_{bx}(t_2)] = \eta^2 L_0^2 P\,\mathrm{Re}\,\phi(t_1 - t_2) $$

by (3-34) and (12-1). Taking $\phi(\tau)$ as real, we find for the spectral density of the
noise due to the background light

$$ \Phi_b(\omega) = \eta^2 L_0^2 P\,\Phi(\omega) = \eta^2 L_0^2\,\frac{P}{W} \tag{12-69} $$

when as before the background light has a uniform spectral density over a band of
width $W$ hertz. This corresponds to white noise of unilateral spectral density

$$ N_b' = 2\eta^2 L_0^2\,\frac{P}{W} = 2\eta L_0^2 \bar{N}, \tag{12-70} $$

where as before $n_0 = \eta PT$ is the total average number of photoelectrons that would
be ejected if the background light fell directly onto the detector, and $\bar{N} = n_0/WT$ is as in
(12-68). The probabilities of false alarm and detection are now, as in (2-72),

$$ Q_0 = \mathrm{erfc}\,x, \qquad Q_d = \mathrm{erfc}(x - D), $$

with the effective signal-to-noise ratio given by

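$$ D = \left[\frac{2}{N_{sh} + N_b'}\int_{-\infty}^{\infty} [s(t)]^2\,dt\right]^{1/2} = \left[\frac{4n_s}{1 + 2\bar{N}}\right]^{1/2}. \tag{12-71} $$

Here

$$ \int_{-\infty}^{\infty} [s(t)]^2\,dt = \eta^2 L_0^2\int_{-\infty}^{\infty} |S(t)|^2\,dt = 2\eta L_0^2 n_s $$

by (12-5), and for $N_{sh}$ and $N_b'$ we have substituted (12-63) and (12-70), respectively.
Again, $n_s$ is the expected number of photoelectrons that would be ejected by the
coherent component of the incident light if it fell directly onto the photodetector.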
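The heterodyne and homodyne signal-to-noise ratios can be compared numerically. The sketch below takes $D^2 = 2n_s/(1 + \bar{N})$ for the heterodyne receiver and $D^2 = 4n_s/(1 + 2\bar{N})$ for the homodyne receiver, and evaluates the homodyne detection probability under the assumption that erfc in the formulas above denotes the unit-normal tail probability, written here as `0.5 * math.erfc(x / sqrt(2))`. The function names and the bisection used to find the threshold are illustrative choices.

```python
import math


def snr_heterodyne(n_s, n_bar):
    """Effective D^2 of the heterodyne receiver, 2 n_s / (1 + n_bar)."""
    return 2.0 * n_s / (1.0 + n_bar)


def snr_homodyne(n_s, n_bar):
    """Effective D^2 of the homodyne receiver, 4 n_s / (1 + 2 n_bar)."""
    return 4.0 * n_s / (1.0 + 2.0 * n_bar)


def gaussian_tail(x):
    """Probability that a unit normal variable exceeds x (assumed 'erfc x')."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def gaussian_threshold(q0):
    """Threshold x with gaussian_tail(x) = q0, found by bisection."""
    lo, hi = -40.0, 40.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gaussian_tail(mid) > q0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def homodyne_detection_probability(n_s, n_bar, q0):
    """Q_d = tail(x - D) with Q_0 = tail(x) and D^2 from snr_homodyne."""
    x = gaussian_threshold(q0)
    return gaussian_tail(x - math.sqrt(snr_homodyne(n_s, n_bar)))
```

The ratio `snr_homodyne(n_s, n_bar) / snr_heterodyne(n_s, n_bar)` tends to 2 for small `n_bar` and to 1 for large `n_bar`.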

Comparing (12-68) and (12-71), we see that in the classical limit, $\bar{N} \gg 1$, both
signal-to-noise ratios approach the same value $2n_s/\bar{N}$, and by the Planck law (12-31),
with $kT \gg hf$, this ratio becomes

$$ D^2 \to \frac{2n_s}{\bar{N}} \to \frac{2\eta' E_s}{\eta' hf\,(kT/hf)} = \frac{2E_s}{kT}, $$

where $E_s$ is the energy of the incident signal, $k$ is Boltzmann's constant, and $T$ is the
effective absolute temperature of the background radiation. This is the same as the
maximum attainable signal-to-noise ratio deduced in Sec. 2.4. In the quantum limit
$\bar{N} \to 0$, on the other hand, the homodyne receiver provides twice as large an effective
signal-to-noise ratio as the heterodyne receiver. The 1's in the denominators of (12-68)
and (12-71) arise from the irreducible quantum fluctuations of the electromagnetic
field.

12.5 DETECTION BASED ON COUNTING TIMES

In the previous sections it was assumed that the decision about the presence or ab-
sence of a signal is based only on the total number $n$ of photoelectrons counted
during the observation interval $(0, T)$. The question arises whether a greater proba-
bility of detection could be attained by observing the times at which the individual
electrons are ejected by the incident light. In general this must be the case, for if
one discarded that information and used only the total number of electrons, one
would expect on the basis of our discussion in Chapter 1 to suffer a loss of signal
detectability. To calculate just how great the loss is, however, is extremely difficult.

Conditioned on a particular realization of the incident field $v(t) = \mathrm{Re}\,V(t)\exp i\Omega t$,
the probability that one electron is emitted during each of the $m$ subintervals
$(t_i, t_i + dt_i)$, $1 \le i \le m$, and none during the rest of the observation interval $(0, T)$, is

$$ \Pr(t_i < \tau_i < t_i + dt_i,\ 1 \le i \le m \mid V(t),\ 0 < t < T) = \prod_{i=1}^{m} \lambda(t_i)\,dt_i\,\exp\left[-\int_0^T \lambda(t)\,dt\right], \tag{12-72} $$

where $\tau_i$ is the time at which the $i$th electron is emitted, and as in (12-3)

$$ \lambda(t) = \tfrac{1}{2}\eta\,|V(t)|^2, $$

with $\eta = \eta'/hf$ as in Sec. 12.1.1 [Bar69].

To show this, we divide the interval $(0, T)$ into a large number of disjoint subin-
tervals $\Delta t$. The emission times $\tau_i$ form a Poisson point process, and the probabilities
that no electron and one electron is emitted during any subinterval are, respectively,

$$ \Pr[0\ \text{e's in}\ (t, t + \Delta t)] = \exp[-\lambda(t)\Delta t], \tag{12-73} $$

$$ \Pr[1\ \text{e in}\ (t, t + \Delta t)] = \lambda(t)\Delta t\,\exp[-\lambda(t)\Delta t]. \tag{12-74} $$

Whether an electron is emitted in any subinterval is independent of whether one is
emitted in any other subinterval. Hence we can multiply the probabilities (12-73)
and (12-74) for the several subintervals, and letting $\Delta t$ become infinitesimal and
replacing it by $dt$, we obtain (12-72).

This probability must be averaged over the ensemble of the processes $V(t)$.
The value of the likelihood ratio when $m$ emissions have been observed to take place
at times $\tau_1, \tau_2, \ldots, \tau_m$ and none at any other times is then

$$ \Lambda(\tau_1, \tau_2, \ldots, \tau_m) = \frac{E_1\left\{\prod_{j=1}^{m} \lambda_1(\tau_j)\exp\left[-\int_0^T \lambda_1(t)\,dt\right]\right\}}{E_0\left\{\prod_{j=1}^{m} \lambda_0(\tau_j)\exp\left[-\int_0^T \lambda_0(t)\,dt\right]\right\}}, \tag{12-75} $$

where $\lambda_i(\cdot)$ is the emission rate under hypothesis $H_i$, and $E_i$ indicates an average
with respect to the ensemble of processes $V(t)$ under that hypothesis, $i = 0, 1$. This
likelihood ratio is compared with a decision level $\Lambda_0$, which if exceeded evokes the
decision that $H_1$ is true and a signal is present. The value of $\Lambda_0$ is determined by the
false-alarm probability $Q_0$. In general the probability distributions of the statistic in
(12-75) are difficult to compute under either hypothesis.

12.5.1 The Poisson Limit 

Suppose that the background light falling on the photoelectrically emissive surface
of the detector has a bandwidth $W$ much greater than the reciprocal of the resolv-
ing time of the device recording the emission times $\tau_i$. Then under hypothesis $H_0$
the fluctuations in the rate are negligible, and we can take $\lambda_0(t) \equiv \lambda_0$ as constant.
Suppose that the signal to be detected is the output $s(t) = \mathrm{Re}\,S(t)\exp i\Omega t$ of an
ideal laser. Then under hypothesis $H_1$ we can take

$$ \lambda_1(t) = \lambda_0 + \tfrac{1}{2}\eta\,|S(t)|^2 $$

as nonrandom as well. The logarithm of the likelihood ratio in (12-75) is now

$$ \ln\Lambda(\tau_1, \tau_2, \ldots, \tau_m) = \sum_{i=1}^{m} \ln\left[\frac{\lambda_1(\tau_i)}{\lambda_0}\right] - \int_0^T [\lambda_1(t) - \lambda_0]\,dt. $$

The statistic

$$ g = \sum_{i=1}^{m} \ln\left[\frac{\lambda_1(\tau_i)}{\lambda_0}\right] \tag{12-76} $$

is thus sufficient for deciding about the presence or absence of the signal. If the
signal envelope is constant, $S(t) \equiv S_0$, the statistic $g$ is simply proportional to the
total number $m$ of electrons counted in $(0, T)$, and their emission times $\tau_i$ provide
no useful additional information. If the signal $S(t)$ is variable, they do provide
information, and one can anticipate that utilizing the statistic in (12-76) will enhance
signal detectability.

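To calculate the performance of a receiver based on $g$ of (12-76) is difficult.
The moment-generating function of $g$ under hypothesis $H_i$, $i = 0, 1$, can be worked
out by the following analysis. Divide the interval $(0, T)$ into a large number of brief
subintervals $(t_{k-1}, t_k)$, $1 \le k \le M$. Then the statistic $g$ is approximately

$$ g \approx \sum_{k=1}^{M} n_k \ln\left[\frac{\lambda_1(t_k)}{\lambda_0}\right], \qquad t_k = k\,\Delta t, \quad \Delta t = \frac{T}{M}, \tag{12-77} $$

where $n_k$ is the number of photoelectrons emitted in the $k$th subinterval. For $\Delta t$
sufficiently small, $n_k$ is either 0 or 1 with much greater probability than that $n_k > 1$,
and (12-77) reduces to (12-76). The numbers $n_k$ are independent Poisson-distributed
random variables with expected values

$$ E(n_k \mid H_0) = \lambda_0\,\Delta t, \qquad E(n_k \mid H_1) = \lambda_1(t_k)\,\Delta t. $$

The moment-generating function of the statistic $g$ is then approximately

$$ h_i(z) = E[e^{-zg} \mid H_i] \approx \prod_{k=1}^{M} E\left\{\left[\frac{\lambda_1(t_k)}{\lambda_0}\right]^{-z n_k} \Bigm| H_i\right\}. \tag{12-78} $$

For a Poisson random variable with expected value $m$, as in (12-13),

$$ E(z^n) = e^{m(z - 1)}. $$

Replacing $z$ by $[\lambda_1(t_k)/\lambda_0]^{-z}$ and $m$ by $\lambda_i(t_k)\Delta t$ in the $k$th factor in (12-78), we find

$$ h_i(z) \approx \prod_{k=1}^{M} \exp\left\{\lambda_i(t_k)\left[\left(\frac{\lambda_1(t_k)}{\lambda_0}\right)^{-z} - 1\right]\Delta t\right\} = \exp\left\{\sum_{k=1}^{M} \lambda_i(t_k)\left[\left(\frac{\lambda_1(t_k)}{\lambda_0}\right)^{-z} - 1\right]\Delta t\right\}, $$

and in the limit $M \to \infty$, $\Delta t \to 0$ this becomes

$$ h_i(z) = \exp\left\{\int_0^T \lambda_i(t)\left[\left(\frac{\lambda_1(t)}{\lambda_0}\right)^{-z} - 1\right] dt\right\}, \qquad i = 0, 1, \tag{12-79} $$

with

$$ \lambda_0(t) = \lambda_0, \qquad \lambda_1(t) = \lambda_0 + \tfrac{1}{2}\eta\,|S(t)|^2. $$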

These moment-generating functions cannot be inverted analytically to obtain
the probability density functions of the statistic $g$ unless $|S(t)|^2$ is constant, where-
upon, as we already know, the statistic $g$ is proportional to a Poisson-distributed
integer-valued random variable. Otherwise the false-alarm and detection probabili-
ties might be computed by numerical integration of the contour integrals in (12-11)
and (12-12), but it would be necessary to evaluate $h_i(z)$ at each point $z$ on the contour
of integration by integrating the exponent in (12-79) numerically. These moment-
generating functions have the same form as that in (12-54) if there we put $t = 0$ and
$f(-\tau) = \ln[\lambda_1(\tau)/\lambda_0]$. The same problems as attend calculating the distribution of
shot noise will arise here.
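As a minimal sketch of the two computations involved, the code below forms the statistic $g = \sum_i \ln[\lambda_1(\tau_i)/\lambda_0]$ from a list of emission times and evaluates the exponent $\int_0^T \lambda_i(t)\{[\lambda_1(t)/\lambda_0]^{-z} - 1\}\,dt$ of the moment-generating function by the midpoint rule, with $\lambda_1(t) = \lambda_0 + \tfrac{1}{2}\eta|S(t)|^2$. The function names and the choice of quadrature are assumptions made for illustration.

```python
import math


def signal_rate(t, lam0, eta, envelope):
    """Emission rate under H1: lam0 + eta |S(t)|^2 / 2."""
    return lam0 + 0.5 * eta * abs(envelope(t)) ** 2


def statistic_g(emission_times, lam0, eta, envelope):
    """Sufficient statistic: sum over emissions of ln[lam1(tau_i) / lam0]."""
    return sum(math.log(signal_rate(tau, lam0, eta, envelope) / lam0)
               for tau in emission_times)


def log_mgf_g(z, hypothesis, lam0, eta, envelope, t_end, steps=2000):
    """Exponent of h_i(z): integral of lam_i(t) {[lam1(t)/lam0]^(-z) - 1} dt."""
    dt = t_end / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt                      # midpoint of the kth subinterval
        lam1 = signal_rate(t, lam0, eta, envelope)
        lam_i = lam1 if hypothesis == 1 else lam0
        total += lam_i * ((lam1 / lam0) ** (-z) - 1.0) * dt
    return total
```

For a constant envelope the exponent reduces to $\lambda_i T\{[\lambda_1/\lambda_0]^{-z} - 1\}$, the Poisson form, which gives a simple check on the quadrature.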

12.5.2 Incoherent Gaussian Light 

The probability density functions in the numerator and denominator of (12-75) are

$$ P_i(\tau_1, \tau_2, \ldots, \tau_m) = E\left\{\prod_{j=1}^{m} \lambda_i(\tau_j)\exp\left[-\int_0^T \lambda_i(t)\,dt\right] \Bigm| H_i\right\}, \tag{12-80} $$

$i = 0, 1$; $E$ denotes an expectation with respect to the ensemble of stochastic pro-
cesses $\lambda_i(t)$. Let us now suppose that under both hypotheses the complex envelope
$V(t)$ is a stationary circular complex Gaussian random process with expected value
zero and autocovariance function $P\phi(t - u)$ as in Sec. 12.1. The rate of emission
of photoelectrons is

$$ \lambda(t) = \tfrac{1}{2}\eta\,|V(t)|^2 $$

as in (12-3). We drop the subscripts referring to the hypothesis. How to evaluate
the expectation in (12-80) has been described by Macchi [Mac71]; see also [Sny75,
pp. 303-309]. Here we present a perhaps simpler method.
We denote the product in (12-80) by

$$ R[V] = \prod_{j=1}^{m} \lambda(\tau_j); $$

in order to find the expected value

$$ E = E\left\{R[V]\exp\left[-\int_0^T \lambda(t)\,dt\right]\right\} = E\left\{R[V]\exp\left[-\tfrac{1}{2}\eta\int_0^T |V(t)|^2\,dt\right]\right\} \tag{12-81} $$

we express $V(t)$ in terms of an orthonormal set of sampling functions $f_k(t)$, as before.
Then $R[V]$ depends on the $V_k$'s in a way that we need not specify. We evaluate
(12-81) by multiplying the contents of the expectation by the joint probability density
function of the real and imaginary parts of a finite number $M$ of samples $V_k =
V_{kx} + iV_{ky}$ in circular Gaussian form, and we then integrate over the entire $2M$-
dimensional space of $V_{kx}$ and $V_{ky}$,

$$ E = (2\pi)^{-M}|\det\boldsymbol{\Phi}|^{-1}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} R(\mathbf{V})\exp\left[-\tfrac{1}{2}\eta\mathbf{V}^+\mathbf{V} - \tfrac{1}{2}\mathbf{V}^+\boldsymbol{\Phi}^{-1}\mathbf{V}\right] d^M V_{kx}\,d^M V_{ky}, $$

where $\boldsymbol{\Phi}$ is the complex autocovariance matrix of the samples $V_k$, and we have put

$$ \int_0^T |V(t)|^2\,dt = \sum_{k=1}^{M} |V_k|^2 = \mathbf{V}^+\mathbf{V}, \qquad R[V] \to R(\mathbf{V}). $$

We write this expected value $E$ as

$$ E = (2\pi)^{-M}|\det\boldsymbol{\Phi}|^{-1}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} R(\mathbf{V})\exp\left[-\tfrac{1}{2}\mathbf{V}^+(\eta\mathbf{I} + \boldsymbol{\Phi}^{-1})\mathbf{V}\right] d^M V_{kx}\,d^M V_{ky} $$

$$ = \frac{|\det\mathbf{M}|}{|\det\boldsymbol{\Phi}|}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} R(\mathbf{V})\,q(\{V_k\})\,d^M V_{kx}\,d^M V_{ky}, \tag{12-82} $$

where $q(\{V_k\})$ is a circular Gaussian density function of samples $V_k$ with covariance
matrix

$$ \mathbf{M} = (\eta\mathbf{I} + \boldsymbol{\Phi}^{-1})^{-1}. $$

This matrix $\mathbf{M}$ is the solution of the matrix equation

$$ \mathbf{M} + \eta\mathbf{M}\boldsymbol{\Phi} = \boldsymbol{\Phi}. $$

The dimension $M$ of these matrices is now allowed to go to infinity. The auto-
covariance function corresponding to the matrix $\boldsymbol{\Phi}$ is $P\phi(t - u)$, and the function
corresponding to the matrix $\mathbf{M}$ is the solution $m(t, u; n)$ of the integral equation

$$ m(t, u; n) + \frac{n}{T}\int_0^T m(t, v; n)\,\phi(v - u)\,dv = P\phi(t - u), \qquad 0 \le (t, u) \le T, \tag{12-83} $$

with $n = \eta PT$ the expected number of photoelectrons under whichever hypothesis
is involved. The factor in front of the second integral in (12-82) becomes

$$ \det(\mathbf{I} + \eta\boldsymbol{\Phi})^{-1} = [D(n)]^{-1}, \qquad n = \eta PT, $$

in terms of the Fredholm determinant $D(x)$ defined in (12-17). Thus we can write
the expectation in (12-82), and hence that in (12-80), as

$$ P(\tau_1, \tau_2, \ldots, \tau_m) = \frac{1}{D(n)}\,E_q\left\{\prod_{j=1}^{m} \lambda(\tau_j)\right\}, $$

where $E_q$ denotes an expectation with respect to an ensemble of circular complex
Gaussian processes $V(t)$ with autocovariance function $m(t, u; n)$. When calculating
such a probability density function under hypothesis $H_0$, one puts $n$ equal to the
expected number $n_0$ of photoelectrons under that hypothesis; and when calculating
it under hypothesis $H_1$, one puts $n = n_1 = n_0 + n_s$.
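The factor $[D(n)]^{-1}$ can be approximated by keeping the number of samples finite. The sketch below is one way to do so, under stated assumptions: the interval $(0, T)$ is divided into equal cells, the matrix $\eta\boldsymbol{\Phi}$ is replaced by $(n/T)\,\phi(t_i - t_j)\,\Delta t$ evaluated at the cell midpoints, and the determinant of $\mathbf{I} + \eta\boldsymbol{\Phi}$ is taken by Gaussian elimination. The normalized autocovariance `phi` is assumed real with `phi(0) = 1`; the function names and the discretization are illustrative and are not the definition (12-17).

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    size = len(a)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap changes the sign
        det *= a[col][col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for k in range(col + 1, size):
                a[row][k] -= factor * a[col][k]
    return det


def fredholm_determinant(n, phi, t_end, points=60):
    """Approximate D(n) = det(I + eta Phi) on a midpoint grid of (0, t_end)."""
    dt = t_end / points
    grid = [(k + 0.5) * dt for k in range(points)]
    matrix = [[(1.0 if i == j else 0.0) + (n / t_end) * phi(grid[i] - grid[j]) * dt
               for j in range(points)]
              for i in range(points)]
    return determinant(matrix)
```

For fully coherent light, `phi` identically 1, the result is $1 + n$, so that the probability of no photoelectrons is $1/(1 + n)$; for a rapidly decaying `phi` the determinant approaches the Poisson value $e^n$ from below.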
In particular, $1/D(n)$ is the probability that no photoelectrons are ejected at all
under the current hypothesis. This is the same as one obtains from (12-19) with
$z = 0$, $S(t) \equiv 0$, and $n_0 = n$.

Furthermore

$$ P(\tau_1) = \frac{\eta}{D(n)}\,m(\tau_1, \tau_1; n), $$

and by (3-44)

$$ P(\tau_1, \tau_2) = \frac{\eta^2}{D(n)}\,(m_{11}m_{22} + m_{12}m_{21}) $$

with

$$ m_{ij} = m(\tau_i, \tau_j; n) $$

in terms of the solution of (12-83). For the density functions of higher order we
follow the rule stated after (3-44). For $m = 3$, for instance,

$$ P(\tau_1, \tau_2, \tau_3) = \frac{\eta^3}{D(n)}\,(m_{11}m_{22}m_{33} + m_{11}m_{23}m_{32} + m_{12}m_{21}m_{33} + m_{12}m_{23}m_{31} + m_{13}m_{21}m_{32} + m_{13}m_{22}m_{31}) = \frac{\eta^3}{D(n)}\,\mathcal{P}_3(\mathbf{m}). $$

Here $\mathcal{P}_3(\mathbf{m})$ is the 3 × 3 permanent of the matrix $\mathbf{m} = \|m_{ij}\|$. It is evaluated like
a determinant, but all terms have positive signs. Continuing thus, we see that the
general form for the probability density function of the counting times $\tau_1, \tau_2, \ldots, \tau_m$
in (12-80) is

$$ P(\tau_1, \tau_2, \ldots, \tau_m) = \frac{\eta^m}{D(n)}\,\mathcal{P}_m(\mathbf{m}), $$

with $\mathcal{P}_m(\mathbf{m})$ the $m \times m$ permanent with elements $m_{ij} = m(\tau_i, \tau_j; n)$, $1 \le (i, j) \le m$.

With these density functions evaluated under the two hypotheses, one can form
the likelihood ratio $\Lambda(\tau_1, \tau_2, \ldots, \tau_m)$ by dividing the density function under hypoth-
esis $H_1$ by that under hypothesis $H_0$, where $\tau_1, \tau_2, \ldots, \tau_m$ are the times at which
photoelectrons have been observed to have been ejected by the incident light. If this
likelihood ratio exceeds a certain decision level $\Lambda_0$, the receiver decides for hypoth-
esis $H_1$; otherwise the null hypothesis $H_0$ is accepted. The problem of calculating
the false-alarm and detection probabilities for this kind of receiver has not, to the
writer's knowledge, been solved.
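Although the error probabilities are out of reach, the density of the counting times itself is easy to evaluate once the matrix of values $m(\tau_i, \tau_j; n)$ and the Fredholm determinant $D(n)$ are in hand. The sketch below computes a permanent by summing over all permutations and forms $\eta^m$ times the permanent divided by $D(n)$; the function names are illustrative, and the matrix and determinant are taken as given inputs rather than computed from the integral equation.

```python
from itertools import permutations


def permanent(matrix):
    """Permanent: like a determinant, but every term enters with a plus sign."""
    size = len(matrix)
    total = 0
    for perm in permutations(range(size)):
        product = 1
        for row, col in enumerate(perm):
            product *= matrix[row][col]
        total += product
    return total


def counting_time_density(m_matrix, eta, fredholm_d):
    """eta^m * permanent(m_matrix) / D(n) for m observed emission times."""
    return eta ** len(m_matrix) * permanent(m_matrix) / fredholm_d
```

The sum over permutations costs on the order of $m \cdot m!$ operations, which is adequate only for a handful of counts; Ryser's formula reduces the cost but remains exponential in $m$. With an empty matrix the permanent is 1 and the density reduces to $1/D(n)$, the probability of no photoelectrons.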

Problems 

12-1. In a binary pulse-position modulation system, digits 0 and 1 are transmitted every
$T$ seconds. For a 0, a light pulse is received during the first half of the interval
$(0, T)$; for a 1, a pulse is received during the second half of the interval. The light
signal and background radiation fall onto a photoelectric detector, which counts the
numbers $n_1$ and $n_2$ of photoelectrons emitted during the intervals $(0, \tfrac{1}{2}T)$ and $(\tfrac{1}{2}T, T)$,
respectively. Under hypothesis $H_0$ that a 0 has come in, the numbers $n_1$ and $n_2$ have
Poisson distributions with expected values $m_1$ and $m_0$, respectively; $m_1 > m_0$. Under
hypothesis $H_1$ the expected values are reversed. The numbers $n_1$ and $n_2$ are statistically
independent, and the 0's and 1's are equally likely. When $n_1 > n_2$, the receiver decides
for hypothesis $H_0$, and when $n_1 < n_2$, for $H_1$. When $n_1 = n_2$, it chooses one or the
other at random and with equal probabilities $\tfrac{1}{2}$. Calculate the probability of error in
this receiver. Express the result in terms of Marcum's Q function. Hint: Express the
probability $\Pr(n_1 - n_2 = k)$ in terms of the modified Bessel function of order $k$.

12-2. In a pulse-position modulation (PPM) system transmitting $M$ equally likely symbols
every $T$ seconds, the interval $(0, T)$ is divided into $M$ equal parts. For the $k$th
symbol a light pulse is transmitted during the $k$th subinterval, but nothing in any of
the others. The receiver counts the numbers $n_1, n_2, \ldots, n_M$ of photoelectrons emitted
from a detector during each of the $M$ subintervals. The receiver decides that the
symbol that was transmitted was the one corresponding to the largest of those data.
If a certain number $r$ of the data are equal and exceed the rest, the receiver chooses
at random and with probability $1/r$ one of the $r$ corresponding symbols. Each of the
$M$ random variables $n_k$ has a Poisson distribution with expected value $m_0$ when no
pulse arrives during the $k$th subinterval and with expected value $m_1$ when one does
arrive; $1 \le k \le M$.

Show that the probability of error can be written as 

1 « ,.,p M-\ 

where 

^i-^ *-. = <>. 

7 = • / - 

This problem is a generalization of Problem 12-1 [Gag76, pp. 261-72].

12-3. Let the number n of photoelectrons have a Laguerre distribution as in (12-29). Set
up the computation of the cumulative probability q™ and its complement q 1 ^ by nu- 
merical integration of (12-11) and (12-12). Show how to determine the saddlepoints 
through which the contour of integration should pass. Develop saddlepoint approx- 
imations for these cumulative probabilities as in Sec. 5.3.2. Compare the results of 
this approximation with the exact cumulative probabilities for M = WT - 20, n - 4, 
n t = 50, and r = 10(10)100. 

12-4. Show that the overall gain Go in the photomultiplier treated in Sec. 12.3.1 equals the 
product of the gains Gj at each dynode. 

12-5. In the photomultiplier treated in Sec. 12.3.1, assume that the probability that an
incident electron ejects $k$ secondary electrons from the $j$th dynode is



for positive constants $a_j$ and $v_j$. Calculate the probability generating function $g_j(z)$ and find
the gain $G_j$ of the $j$th stage in terms of $a_j$ and $v_j$. Show that when a single primary
electron strikes the first dynode, the number $n$ of electrons at the output of the $M$th
stage has a distribution of the same form as this, but with different parameters. (Hint:
Look up the homographic transformation in a text on complex variables.)

12-6. Consider the sum v of Poisson-distributed photoelectron counts n and a Gaussian 
random variable x representing the noise in a subsequent amplifier. Assume that n 
has expected value proportional to the strength of an incident light signal, and 
that $x$ has expected value zero and variance $\sigma^2$. Determine an infinite series for the
probability $Q = \Pr(n + x > V)$ that the sum $v$ of signal and noise exceeds a decision
level $V$. Calculate the moment-generating function $E[\exp(-zv)]$ of $v = n + x$. Show
how to calculate $Q$ and its complement $1 - Q$ by integration along a vertical path in
the complex $z$-plane through a saddlepoint on the Re $z$-axis, and show how to find
the appropriate saddlepoint. Use the saddlepoint approximation to check a few of the
values in Fig. 12-5 for $Q_d > 0.95$, $\sigma > 0$.

12-7. Show that the function M(z) defined by (12-46) must have a branch point at z = x t 
as given in (12-48). Hint: Sketch the curve of z versus M represented by (12-46) for 
real values of M. 

12-8. In (12-80) assume that the rate $\lambda_i(\cdot)$ is nonrandom. By integrating the density function
given there over $\tau_1, \tau_2, \ldots, \tau_m$, show that the probability that $m$ electrons are emitted
during $(0, T)$ has the Poisson form. Keep in mind the constraint $0 < \tau_1 < \tau_2 < \cdots <
\tau_m < T$. Hint: Try mathematical induction.

12-9. Let the distribution of the number of secondary electrons per primary electron in a 
photomultiplier have the Poisson form in (12-44) with gain G per stage. Show that 
the probabilities that k electrons are ejected from the last dynode when a single 
electron strikes the y'th dynode can be calculated from the recurrent relations 




k > 0, 



> 0, 



< vj < 1, 




k > 0, 



beginning with 



G 



Pr 



AM) _ 



,-G 



r > 0, 



for $j = M$ and working down to $j = 1$ [Lom61].

Appendix A



Solution of the Detection 
Integral Equations 



A.1 Inhomogeneous Equations 



Our first task is to solve the integral equation

$$ \int_0^T \phi(t - u)\,q(u)\,du = s(t), \qquad 0 < t < T, \tag{A-1} $$



in which $\phi(\tau)$ is a kernel that is the Fourier integral of a rational spectral density
with simple poles:



with $C$ a positive constant. We take the $\mu_k$'s to have positive real parts. The $2n$ poles
of the spectral density lie at $i\mu_k^*$ in the upper half-plane and at $-i\mu_k$ in the lower
half-plane, $1 \le k \le n$. This more general form of the kernel $\phi(\tau)$ allows for complex
autocovariance functions of narrowband noise as in (3-52) and for applications in
Chapters 11 and 12 where $\Phi(\omega)$ contains a complex parameter.

Decomposition of $\Phi(\omega)$ into partial fractions and Fourier transformation lead
to the expressions




n t/w-Py) 



(A-2) 



= C 



n s/oj - p. k 






4>C - u) =/ 8(? - u) + exp[-^*(/ - u)l t > u, 

(A-3) 

= /o&O - ") + X % k exp[^jLjt(/ - m)], t < u, 
for the kernel of (A-l). This partial-fraction decomposition has the form 

*(«) =fo ■+ £ f-r^r- + — ^M. (A-4) 

When $m < n$, $f_0 = 0$.

As in (2-100) we write the solution of (A-l) in the form 

q{t) = + ?/,(?), (A-5) 

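$$q_h(t) = \sum_{j=0}^{n-m-1} \left[ a_j\, \delta^{(j)}(t) + b_j\, \delta^{(j)}(t - T) \right] + \sum_{j=1}^{2m} c_j \exp \beta_j t. \tag{A-6}$$

Here $\delta^{(j)}(t)$ is the $j$th derivative of the delta function, defined by

$$\int f(u)\, \delta^{(j)}(u - t)\, du = (-1)^j \frac{d^j f}{dt^j}.$$

The delta functions in (A-6) are assumed to stand just inside the interval $(0, T)$, so
that they contribute their full weight when substituted into (A-1). In particular,

$$\int_0^T \phi(t - u)\, \delta^{(j)}(u - a)\, du = (-1)^j \frac{\partial^j}{\partial a^j} \int_0^T \phi(t - u)\, \delta(u - a)\, du = \phi^{(j)}(t - a), \qquad a = 0^+,\ T^-. \tag{A-7}$$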
The particular solution q M (t) in (A-5) is the solution of the integral equation 

*(0=f (A-8) 

It is the same as q*(t) in (2-77) and can be found as in (2-78) and (2-79) when the 
signal $s(t)$ is differentiable $2(n - m)$ times, and in this appendix we shall assume
that this is the case. When — as is most common — the input contains a white-noise 
component, this assumption is valid. If there is no white-noise component and 
the signal is not sufficiently differentiable, one must seek a particular solution by 
other methods or resort to the more complicated procedure presented by Slepian 
and Kadota [Sle69]. 

Substituting (A-8) into the right side of (A-l) and (A-5) into the left side, we 
obtain 

J o <K* - u)q h {u) du = |j + J <f,(; - «)?»<«) du. 
Then substituting from (A-6) and using (A-7) we find 




r-/nH 2") /-7' 

^ [^4> 0) (/) + bj^\i - r)J + X +(' - w ) tJ0> " 

./ = () 7 = i ° 



(A-9) 



When we introduce (A-3), the first summation on the left side of (A-9) yields the 
terms 

n a -in -i 

These terms arise only if $m < n$, whereupon $f_0 = 0$. The second summation yields
terms of the form



exp[M> - u) + 6/«] du 



= Wo + 



A + gk 



k = \ L 



+ ate"*' 



These are summed over 1 <j <2n. Putting w = —ify into (A-4), we see that the 
first term vanishes, 

cj e *'M-ify) = 0, 

by (A-2). 

Substituting (A-3) into the right side of (A-9) produces the terms 

k=i 

L k = e^"q m (u)du, R k = e-^"~ T) q^u) du. (A- 10) 



where 



In order for the integral equation (A-l) to be satisfied, the terms proportional 
to exp(-|X/*0 and to expj^-ff - 7 1 )] must vanish individually, and that requirement 
produces 2n linear simultaneous equations 

II — "1—1 111! 

X ajir^tV - X n . = L kt 1 < k < n, (A-ll) 



7=0 



\ <k <n. 



(A- 12) 



By defining 



P-k- 



\ <k < n. 



we can write (A-l i) as 



-£*_„, n + 1 < A: < 2n, 



(A-l 3) 
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and the set of 2n simultaneous linear equations can be concisely written as 

« <={:] ■ [\] 
■it y 



where 

B = 



is a 2(« - m) x 2(« - m) block matrix in which the (&/)-element of block B 3 "is 

1 < k < n, and that of block B 2 is (-^)^ 3 = m4"J m . The O's are (n - m) X (n - m) 

matrices of zeros. The 2n X 2m matrix G = G(7") is 



ex P 3/ T 

1 



!</:<«, 
« + 1 < k < 2n. 



I <y < 2m, (A- 15) 



In (A-14) $\mathbf{a}$ is a column vector whose transpose is

$$\mathbf{a}^T = (b_0\ \ldots\ b_{n-m-1}\ \ {-a_0}\ \ldots\ {-a_{n-m-1}}),$$

and $\mathbf{c}$ is the $2m$-element column vector of the coefficients $c_j$. There $\mathbf{R}$ and $\mathbf{L}$ are the
$n$-element column vectors of the quantities $R_k$ and $L_k$ on the right sides of (A-12)
and (A-13), respectively.

Solving these equations requires inverting the $2n \times 2n$ matrix of the coefficients
of the $n - m$ $a$'s and $b$'s and the $2m$ $c$'s. It is possible in general to eliminate the
$n - m$ $a$'s and $b$'s from (A-12) and (A-13) so as to reduce these to a system of only
$2m$ simultaneous equations for the $c$'s [Hel65], but the terms of the new equations
are rather more complicated than those in (A-12) and (A-13), and the subsequent
calculation of the $a$'s and $b$'s is then cumbersome. Given a digital computer, it is
more expeditious to let its program for solving linear simultaneous equations carry
the burden of the complexities of solving (A-12) and (A-13), whose coefficients are
relatively simple to program. As we saw in Chapter 2, the $a$'s and $b$'s do not appear
when $n = m$. When $m = 0$, the $c$'s are absent, and (A-12) and (A-13) can then be
solved separately.

If the signal $s(t)$ is a sinusoid, $s(t) = \exp iwt$, the right sides of (A-12) and
(A-13) reduce to 

~ Lk ~" ~ (p* - hvWw)' Rk - (w-mOW (A_16) 

Furthermore, by carrying out a discrete Fourier transform of the signal $s(t)$ over the
interval $(0, T)$, one can approximate it as



J=-M 

and one can compute the $n$ $L_k$'s and $R_k$'s by



T ' 
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T _ V -i R = V — 1 < k < n 

- k -tojY*<wY k jir^H-^m^y 

As an example, let us take the signal to be exp iwt with angular frequency 
w - let T = 1, and let the spectral density of the noise be 

( W 4 + 4)^2 + 4)" <• ; 

Now m = 1, n = 3, and 
C =24, p, - (§) ,/2 , p 2 = -Pi, 

$$\mu_1 = 2, \quad \mu_2 = 1 + i, \quad \mu_3 = 1 - i, \quad \mu_4 = -2, \quad \mu_5 = -1 + i, \quad \mu_6 = -1 - i.$$

Then we find the coefficients on the right side of (A-13) to have the values

$$-L_1 = -0.2682368 + 0.2106727i,$$

$$-L_2 = -0.6542405 + 0.3734381i,$$

$$-L_3 = -0.1139964 + 0.2930614i,$$

and those on the right side of (A- 12) are 

Ri = iL\* t R 2 = iL\, Ri = iLl 

the factor $i$ arising because $\exp iwT = i$ in (A-16). Solving the six simultaneous
linear equations (A-12) and (A-13), the computer gives us the coefficients

$$a_0 = 0.473772 - 0.478199i,$$

$$a_1 = 0.0573014 - 0.177352i,$$

$$b_0 = i a_0^*, \qquad b_1 = -i a_1^*,$$

$$c_1 = 0.0861594 + 0.0916176i, \qquad c_2 = i c_1^*.$$

In general, indeed, when $s(t) = \exp iwt$,

= Zi e*" 7 ", 6; = i-\) j aj e iwT , c J+m = c* \<j <m, 

as one can show by substituting these into (A-11) and (A-12) and comparing the
results. Taking the imaginary part of our solution yields the result of the example
in [Lan56, Sec. 8.4, pp. 309-29].

A.2 Homogeneous Equations and Their Eigenvalues 

We seek a method for computing the eigenvalues and eigenfunctions of the integral 
equation 

$$\lambda f(t) = \int_0^T \phi(t - u)\, f(u)\, du, \qquad 0 < t < T, \tag{A-18}$$

when $\phi(\tau) = \phi^*(-\tau)$ is an Hermitian kernel of the same type as in (A-3). We assume,
without loss of generality, that $m < n$, so that $f_0 = 0$.

The same technique as in Sec. A.1 is employed, but as if the kernel of the
integral equation (A-1) were $\lambda\delta(t - u) - \phi(t - u)$ and the right side were zero. The
Fourier transform of the kernel, when cleared of fractions, is a rational function
with numerator and denominator of the same degree $2n$. Delta functions are then
absent from the solution. From (A-6) we see that the eigenfunctions $f(t)$ of (A-18)
have the form

$$f(t) = \sum_{j=1}^{2n} c_j \exp \beta_j t, \tag{A-19}$$

where by (A-2) the $\beta_j$ are the $2n$ roots of the algebraic equation that results from
clearing the equation

$$\lambda = \Phi(-i\beta), \qquad \beta = \beta_j, \quad 1 \le j \le 2n, \tag{A-20}$$

of fractions. When $\phi(\tau)$ and $\lambda$ are both real, the roots $\beta_j$ either are real or occur in
complex-conjugate pairs.

Because $s(t)$ is now zero, so is the particular solution, and in (A-12) and (A-13) $L_k = R_k = 0$.
The terms $a_j$ and $b_j$ have disappeared. Thus (A-14), with $m = n$, becomes simply
$\mathbf{G}(T)\mathbf{c} = \mathbf{0}$, where $\mathbf{G}(T)$ is the matrix, now $2n \times 2n$, whose elements are specified by
(A-15). In order for nonzero solutions of these $2n$ homogeneous linear equations to
exist, the determinant of their coefficients must vanish:

$$\det \mathbf{G}(T) = 0. \tag{A-21}$$

This equation is equivalent to

$$\frac{\det \mathbf{G}(T)}{\det \mathbf{G}(0)} = 0, \tag{A-22}$$

with the elements of $\mathbf{G}(0)$ given by (A-15) with $T = 0$. For computation (A-22) is
preferred, for as will be seen in Chapter 11, the left side of (A-22) is proportional to
the Fredholm determinant

$$D(z) = \prod_{j=1}^{\infty} (1 + \lambda_j z), \qquad z = -\frac{1}{\lambda},$$

and will be real whenever $z$ is real; $\det \mathbf{G}(T)$ by itself may be purely imaginary. As
a function of $z$, $D(z)$ oscillates with ever increasing amplitude as $z \to -\infty$.

The eigenvalues $\lambda$ of (A-18) are calculated by solving (A-22) by the secant
method. The left side of (A-22) is computed as a function of $\lambda$ by first finding the
$2n$ roots $\beta_j$ of the algebraic equation derived from (A-20),

$$\lambda P(-i\beta) - N(-i\beta) = 0, \qquad \beta = \beta_1, \beta_2, \ldots, \beta_{2n}, \tag{A-23}$$

and substituting them into (A-22). Here $N(\omega)$ and $P(\omega)$ are the polynomials of
degrees $2m$ and $2n$ in the numerator and denominator, respectively, of $\Phi(\omega)$ as in (A-2). The
determinants can be evaluated and the algebraic equation (A-23) solved by standard
computer routines. The search is conveniently started by taking $\lambda$ just below the
maximum value of the spectral density $\Phi(\omega)$ for $\omega$ real.

In order to see how to modify (A-22) to accommodate multiple poles of the
spectral density $\Phi(\omega)$, suppose that $\mu_1$ and $\mu_2$ have coalesced to form double poles
at $i\mu_1^*$ and $-i\mu_1$. Let $\mu_2 = \mu_1 + \epsilon$, $\epsilon \ll |\mu_1|$. Just before their coalescence, the first
and second rows of $\mathbf{G}(T)$ are nearly equal, and so are the $(n + 1)$th and $(n + 2)$th
rows. The determinant $\det \mathbf{G}(T)$ is unchanged if we subtract the first row from the
second, element by element. Each element of the new second row is approximately
$\epsilon$ times the derivative with respect to $\mu_1$ of the element above it. Factor $\epsilon$ from
the elements of that row. Carry out the same procedure with the first and second
rows of $\det \mathbf{G}(0)$. The $\epsilon$'s will cancel from (A-22), and we can now let $\epsilon$ go to zero.
The first row of both $\det \mathbf{G}(T)$ and $\det \mathbf{G}(0)$ will be unchanged; the elements of the
second row will have been replaced by the first derivatives $\partial/\partial\mu_1$ of those in the first
row. The same change must be made in the $(n + 2)$th row of both $\mathbf{G}(T)$ and $\mathbf{G}(0)$.

We thus see that if $\Phi(\omega)$ has poles $\pm i\mu$ of, say, order $p > 1$, the upper halves of
$\det \mathbf{G}(T)$ and $\det \mathbf{G}(0)$ will contain $p - 1$ rows whose elements are the first through
the $(p - 1)$th derivatives, with respect to $\mu$, of the elements of the row involving $\mu$,
and the same will be true of the lower halves of $\det \mathbf{G}(T)$ and $\det \mathbf{G}(0)$.

Once the eigenvalues are known, the eigenfunctions of (A-18) are found by
solving the homogeneous equations $\mathbf{G}(T)\mathbf{c} = \mathbf{0}$ for each set of $2n$ $c_j$'s. These will
contain an arbitrary constant factor. One substitutes them into (A-19) and evaluates
the constant factor from the requirement

$$\int_0^T |f(t)|^2\, dt = 1.$$

The $\beta_j$'s in (A-19) will have been found from (A-20) in the course of searching for
the eigenvalue $\lambda$.
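The secant search used above for the roots of (A-22) can be sketched in a few lines. This is only a generic illustration: the function `toy_ratio` is a hypothetical stand-in for the left side of (A-22) as a function of $\lambda$, since evaluating the actual determinants requires the matrix elements of (A-15).

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Find a root of f by the secant method, starting from x0 and x1."""
    for _ in range(max_iter):
        f0, f1 = f(x0), f(x1)
        if f1 == f0:                      # flat secant line: cannot proceed
            break
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, x1 = x1, x2
        if abs(x1 - x0) <= tol * max(1.0, abs(x1)):
            break
    return x1


def toy_ratio(lam):
    """Hypothetical stand-in for det G(T)/det G(0); NOT the book's function."""
    return lam**3 - 2.0 * lam - 5.0


root = secant(toy_ratio, 2.0, 3.0)
```

In an actual eigenvalue search one would replace `toy_ratio` by a routine that solves (A-23) for the $\beta_j$ and evaluates the two determinants, and start the iteration just below the maximum of the spectral density.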

For the spectral density in (A- 17) with T = 1, (A-20) reduces to the algebraic 
equation 

p6 _ 4j3 4 + 4(1 + 6z){3 2 _ 16(1 + r ) = o, r = --. 

The roots $\beta_j$ are the positive and negative square roots of the roots of the cubic
equation

x 3 - 4x 2 + 4(1 + 6z)x - 16(1 + r) = 0, f3 = ±Vx. 

As before,

$$\mu_1 = 2, \quad \mu_2 = 1 + i, \quad \mu_3 = 1 - i, \quad \mu_4 = -2, \quad \mu_5 = -1 + i, \quad \mu_6 = -1 - i,$$

in (A-15). By solving (A-22) by the secant method, the eigenvalues listed in Table
A-1 were obtained. The second column of Table A-1 lists the frequencies $\omega_k$ such
that $\lambda_k = \Phi(\omega_k)$ in order to illustrate the fact that the eigenvalues tend to be samples
of the spectral density $\Phi(\omega)$ at angular frequencies separated when $k \gg 1$ by
approximately $\pi/T$.

The eigenvalues in Table A-l sum to 0.9998904; the sum of all the eigenvalues 
must equal 

The slow decrease of the spectral density, and hence of the eigenvalues, to zero with
increasing frequency and index $k$ accounts for the discrepancy.

Table A-1 Eigenvalues and Equivalent Frequencies

Eigenvalue $\lambda_k$     Frequency $\omega_k$
0.8181314                  0.59627
0.1557310                  1.04294
1.939974(-2)               1.84274
4.290960(-3)               2.72199
1.316924(-3)               3.67553
5.265503(-4)               4.63280
2.450937(-4)               5.61580
1.293174(-4)               6.59398
7.396168(-5)               7.58604
4.541252(-5)               8.57257
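As a quick arithmetic check, the ten eigenvalues of Table A-1 can be added up and compared with the sum 0.9998904 quoted in the text. The list below simply transcribes the first column of the table, reading an entry such as 1.939974(-2) as $1.939974 \times 10^{-2}$; that reading of the notation is an assumption.

```python
# First column of Table A-1; "(-2)" is read as a power-of-ten exponent.
eigenvalues = [
    0.8181314,
    0.1557310,
    1.939974e-2,
    4.290960e-3,
    1.316924e-3,
    5.265503e-4,
    2.450937e-4,
    1.293174e-4,
    7.396168e-5,
    4.541252e-5,
]

total = sum(eigenvalues)
print(f"sum of tabulated eigenvalues = {total:.7f}")
```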



Problem 

A-1. By solving (A-22) calculate the first six eigenvalues of (A-18) for the spectral 
density 



and for $T = 1$. For each eigenvalue calculate the two angular frequencies $\omega$
such that $\lambda = \Phi(\omega)$. Compare the sum of your eigenvalues with the sum of
all eigenvalues $\lambda_k$, $1 \le k < \infty$.
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Appendix B 



Circular Gaussian 
Density Functions 



To prove that (3-40) indeed represents the joint probability density function of the $2n$
random variables $x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n$, we shall show how it can be reduced
to the conventional form (2-1). We take the expected values of all the variables equal
to zero; to go from (3-40) to (3-41) is straightforward.

The calculation is simplest in matrix notation. We write the $2n$-element column
vector made up of the real random variables $x_m$ and $y_m$, $1 \le m \le n$, arranged
vertically, as

$$\begin{bmatrix} \mathbf{x} \\ \mathbf{y} \end{bmatrix},$$

where $\mathbf{x}$ is the $n$-element column vector of the $x_m$'s and $\mathbf{y}$ the $n$-element column
vector of the $y_m$'s. The $2n \times 2n$ matrix $\Phi$ in (3-38) is similarly written in block
form,

HS: t} 

the elements of 4> A - = ||4>av»"II and <j>,, = ||cf>, VJHI || are defined by (3-37). 
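Let $\mathbf{T}$ be the matrix

$$\mathbf{T} = 2^{-1/2} \begin{bmatrix} \mathbf{I} & i\mathbf{I} \\ \mathbf{I} & -i\mathbf{I} \end{bmatrix} \tag{B-2}$$

with $\mathbf{I}$ the $n \times n$ identity matrix. (A scalar in front of a matrix multiplies each
element of the matrix.) The Hermitian conjugate of $\mathbf{T}$ is

$$\mathbf{T}^+ = 2^{-1/2} \begin{bmatrix} \mathbf{I} & \mathbf{I} \\ -i\mathbf{I} & i\mathbf{I} \end{bmatrix} \tag{B-3}$$

and by the rules of matrix multiplication

$$\mathbf{T}\mathbf{T}^+ = \mathbf{T}^+\mathbf{T} = \begin{bmatrix} \mathbf{I} & \mathbf{0} \\ \mathbf{0} & \mathbf{I} \end{bmatrix},$$

where $\mathbf{0}$ is the $n \times n$ matrix of zeros. Thus the matrix $\mathbf{T}$ is unitary, and in terms of
it we can write

$$(\mathbf{z}^{*T}\ \mathbf{z}^T) = 2^{1/2} (\mathbf{x}^T\ \mathbf{y}^T)\, \mathbf{T}^+, \tag{B-4}$$

where $\mathbf{z}$ is the column vector with elements $z_m = x_m + i y_m$, $1 \le m \le n$. The
superscript $T$ indicates a simple interchange of rows and columns, or a transpose,
without complex conjugation. The row vector $\mathbf{z}^{*T}$ has elements $z_m^*$, $1 \le m \le n$, and

$$\mathbf{z}^T = (z_1, z_2, \ldots, z_n).$$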
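The unitarity of $\mathbf{T}$ and the relation (B-4) are easy to confirm numerically. The following sketch builds $\mathbf{T}$ for a small $n$ from plain Python lists; the helper names `matmul`, `build_T`, and `dagger` are illustrative choices, not notation from the text.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def build_T(n):
    """T = 2**-0.5 * [[I, iI], [I, -iI]] as in (B-2)."""
    s = 2 ** -0.5
    T = [[0j] * (2 * n) for _ in range(2 * n)]
    for k in range(n):
        T[k][k] = s
        T[k][n + k] = 1j * s
        T[n + k][k] = s
        T[n + k][n + k] = -1j * s
    return T


def dagger(m):
    """Hermitian conjugate (conjugate transpose) of a matrix."""
    return [[m[j][i].conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]


n = 3
T = build_T(n)
Td = dagger(T)

# Largest deviation of T T^+ from the 2n x 2n identity matrix.
TTd = matmul(T, Td)
unitary_error = max(abs(TTd[i][j] - (1 if i == j else 0))
                    for i in range(2 * n) for j in range(2 * n))

# Check (B-4): (z*^T z^T) = 2**0.5 (x^T y^T) T^+ for random real x, y.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]
z = [complex(a, b) for a, b in zip(x, y)]
row = matmul([x + y], Td)[0]
lhs = [c.conjugate() for c in z] + z
b4_error = max(abs(l - 2 ** 0.5 * r) for l, r in zip(lhs, row))
```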

Working backward, we write the quadratic form in (3-40) as 



(B-5) 



where 



e = ii*;iw* = i(»-V)[* 

by (B-2) and (B-3) and the rules of matrix multiplication. 

When as here the matrix fx is Hermitian, fi T = £*, the matrix M can be written 

M= [£ ~Z\ " = ^ + ^- 

Then the quadratic form in (B-5) becomes 

C = (xV)[ £ "£][*] = ^fex - x%y + y r M + y r fcy 

" " (B-7) 

With <j> = and because T" ] = T + , we find, by (B-l) and as in (B-6), 

M -= T+ [: J+=[fc 1;]=*. * = (b-8) 

so that the matrix $\mathbf{M}$ of the quadratic form $Q$ is indeed the inverse of the covariance
matrix $\Phi$ of the $2n$ random variables $x_1, \ldots, x_n, y_1, \ldots, y_n$. Furthermore, because
$\det(\mathbf{T}^+\mathbf{T}) = \det \mathbf{T}^+ \det \mathbf{T} = 1$,

det * = det T* det[ * *~ r j det T = (det $) 2 . (B-9) 

Putting (B-7) and (B-9) into (3-40), we find the usual form for the joint probability
density function of those $2n$ random variables,

$$p(x_1, \ldots, x_n, y_1, \ldots, y_n) = (2\pi)^{-n} [\det \Phi]^{-1/2} e^{-Q/2}.$$
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A useful integral is 



(2ir)- 



exp 



4X 



m~\ k = l at-] 



= det 4> exp 



Hf = 1 A = l 



(B-10) 



where $\phi = \mu^{-1}$, and the $v_m$'s and the $w_m$'s are arbitrary complex numbers. Now the
matrix $\mu$ is not necessarily Hermitian, and the value of the integral may be complex.
It is only necessary that the integral be finite, which requires $\mu$ to be such that
$\mathrm{Re}(\mathbf{z}^{*T} \mu \mathbf{z})$ is a positive definite quadratic form in the $2n$ variables $x_j$, $y_j$, $1 \le j \le n$.
We shall use (B-10) in Chapters 11 and 12 under circumstances in which $\mu$ has the
form $\mu_1 + z\mu_2$ with $\mu_1$ and $\mu_2$ Hermitian, but $z$ is a complex variable, whereupon
$\mu_1 + z\mu_2$ is not Hermitian.

To derive (B-10) we introduce the 2n complex numbers {s[, ... , s' in s[', ... , s'J), 
which make up the column vector 



a - 



composed of the two $n$-element column vectors $\mathbf{s}'$ and $\mathbf{s}''$ of the $s_j'$ and the $s_j''$,
respectively. They are defined by the transformation



a - 



_ 2-1/2 j+ 



w 

V* 



in terms of the u, J( 's and the u',„'s, whence 

= 2 1/2 T 



[;] 



V~ 




V + is"' 


s" 




s' - is" 



(B-ll) 



In addition, if we again use the superscript $T$ to denote the transpose of a vector
or a matrix (without complex conjugation of the elements), we can write the row
vector

(v^v/ 7 ) = ^VV^T 4 , (B-12) 
and by (B-4) the second summation in the exponent in (B-10) is 



Uv* T yv T ) 



= (s'V r ) 



- s' 7 x + s' /r y. 



By (B-5) and (B-9) the integral to be evaluated in (B-10) is now 



/ = (2-ir)" 



exp 



11 dx k dyk> 



k = l 



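with $Q$ given in (B-5) in terms of the matrix $\mathbf{M}$ as defined in (B-6).

Now for an $N \times N$ symmetric matrix $\mathbf{C}$ and a row vector $\mathbf{a}^T = (a_1, \ldots, a_N)$,

$$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\left(-\tfrac{1}{2} \mathbf{t}^T \mathbf{C} \mathbf{t} + \mathbf{a}^T \mathbf{t}\right) dt_1 \cdots dt_N$$

$$= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\left[-\tfrac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} C_{ij} t_i t_j + \sum_{i=1}^{N} a_i t_i\right] \prod_{i=1}^{N} dt_i \tag{B-13}$$

$$= (2\pi)^{N/2} (\det \mathbf{C})^{-1/2} \exp\left(\tfrac{1}{2} \mathbf{a}^T \mathbf{C}^{-1} \mathbf{a}\right),$$

and

$$\mathbf{a}^T \mathbf{C}^{-1} \mathbf{a} = \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij} a_i a_j = \mathbf{a}^T \mathbf{c}\, \mathbf{a}, \qquad \mathbf{c} = \mathbf{C}^{-1}.$$

This result holds even though the elements of the matrix $\mathbf{C}$ are complex, provided
that the integral is finite, and that requires that

$$\sum_{i=1}^{N} \sum_{j=1}^{N} (\mathrm{Re}\, C_{ij})\, t_i t_j > 0;$$

that is, the matrix $\mathrm{Re}\, \mathbf{C}$ must be positive definite [Hel91, pp. 237-9]. The proof
rests on the ability to transform the quadratic form $\mathbf{t}^T \mathbf{C} \mathbf{t}$ into a sum of squares by
a linear transformation $\mathbf{y} = \mathbf{U}\mathbf{t}$, where $\mathbf{U}$ is an orthogonal matrix, and this can be
done provided the matrix $\mathbf{C}$ is symmetric, $C_{ij} = C_{ji}$, whether or not its elements are
complex.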
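The Gaussian integral (B-13) with complex entries can be checked numerically for $N = 2$. The sketch below compares a brute-force sum over a grid with the closed form; the particular matrix `C` and vector `a` are arbitrary test values chosen here (with a positive-definite real part), and the grid limits and spacing are assumptions suited to these values only.

```python
import cmath
import math

# Arbitrary complex symmetric test matrix with positive-definite real part.
C = [[2.0 + 0.5j, 0.3 - 0.2j],
     [0.3 - 0.2j, 1.5 + 0.1j]]
a = [0.4 + 0.1j, -0.2 + 0.3j]


def integrand(t1, t2):
    """exp(-t^T C t / 2 + a^T t) for N = 2."""
    quad = C[0][0] * t1 * t1 + 2 * C[0][1] * t1 * t2 + C[1][1] * t2 * t2
    return cmath.exp(-0.5 * quad + a[0] * t1 + a[1] * t2)


# Rectangle-rule sum over [-L, L]^2; the integrand is negligible at the edges.
h, L = 0.05, 8.0
npts = int(round(2 * L / h)) + 1
grid = [-L + i * h for i in range(npts)]
numeric = h * h * sum(integrand(t1, t2) for t1 in grid for t2 in grid)

# Closed form (2 pi)^{N/2} (det C)^{-1/2} exp(a^T C^{-1} a / 2) with N = 2.
det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
aCa = (C[1][1] * a[0] * a[0] - 2 * C[0][1] * a[0] * a[1]
       + C[0][0] * a[1] * a[1]) / det
closed = 2 * math.pi * cmath.exp(0.5 * aCa) / cmath.sqrt(det)
```

The principal branch of the square root is adequate here because $\det \mathbf{C}$ lies close to the positive real axis; for matrices with large imaginary parts the branch would need more care.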

We now take N = 2n, C = M = 4H, c = 4>, det C = det M = (det <j>) _2 , 
and a 7 " = (s' r s" T ); and we observe from (B-6) that the matrix M is symmetrical: 
M = M r . Thus we obtain from (B-13) 

/ = det <j> exp(±a r «l>a). 

By (B-8) the quadratic form in the exponent can be written as 

Vl T [fl = ^ r ; r) [o Jr][;] 

= i(z>* r $ W + w r *V) = v* T im = X X 

m=l k=l 

which is the exponent in (B-10); here we have used (B-8), (B-ll), and (B-12). The 
characteristic function in (3-42) can be derived from (B-10) by replacing z>* by iw* 
and w m by iw m and by taking <j> as the complex covariance matrix of the n circular 
complex Gaussian variables zi, zi, ... , z„. 
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Appendix C 

Q Function 



C.1 Properties 



The probability density function

$$q(\alpha, x) = x \exp\left[-\tfrac{1}{2}(x^2 + \alpha^2)\right] I_0(\alpha x)\, U(x) \tag{C-1}$$

is called the noncentral Rayleigh, the Rayleigh-Rice, or the Rice-Nakagami density
function. It describes the distribution of the distance from a point in a plane to
the origin when the Cartesian coordinates of the point are independent Gaussian
random variables of unit variance and expected values $\alpha \cos\psi$ and $\alpha \sin\psi$, $\psi$ an
arbitrary angle. The complementary cumulative distribution

$$Q(\alpha, \beta) = \Pr(x > \beta) = \int_\beta^\infty q(\alpha, x)\, dx \tag{C-2}$$

is known as Marcum's Q function. Some of its properties were given by Rice [Ric44],
and it was extensively calculated and utilized by Marcum [Mar48], [Mar50]. In this
appendix we shall list some of its properties, with brief derivations of a few of them.
In Secs. C.2 and C.3 we treat the generalized Q function $Q_M(\alpha, \beta)$ as defined in
(4-27), and we present methods for computing it. These can be applied to computing
$Q(\alpha, \beta)$ as well by taking $M = 1$.

Particular values of the Q function are

$$Q(\alpha, 0) = 1, \qquad Q(0, \beta) = e^{-\beta^2/2}. \tag{C-3}$$

The normalization equation $Q(\alpha, 0) = 1$ gives us the useful integral

$$\int_0^\infty x\, e^{-ax^2/2} I_0(bx)\, dx = \frac{1}{a}\, e^{b^2/2a} \tag{C-4}$$

by a change of variables.

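The generating function of the even moments of the distribution is

$$\int_0^\infty e^{zx^2} q(\alpha, x)\, dx = (1 - 2z)^{-1} \exp\left(\frac{\alpha^2 z}{1 - 2z}\right)$$

by (C-4). The resemblance of this formula to the generating function of the Laguerre
polynomials,

$$(1 - t)^{-1} \exp\left(-\frac{yt}{1 - t}\right) = \sum_{n=0}^{\infty} L_n(y)\, t^n,$$

[Erd53, vol. 2, p. 189, eq. (17)], gives us the moments of even order:

$$E(x^{2n}) = 2^n\, n!\, L_n\left(-\tfrac{1}{2}\alpha^2\right). \tag{C-5}$$

In particular

$$E(x^2) = 2 + \alpha^2, \qquad E(x^4) = 8 + 8\alpha^2 + \alpha^4.$$

Even moments of higher order can be calculated by the recurrent relation for the
Laguerre polynomials,

$$(n + 1) L_{n+1}(y) = (2n + 1 - y) L_n(y) - n L_{n-1}(y), \qquad L_0(y) = 1, \quad L_1(y) = 1 - y, \tag{C-6}$$

[Erd53, vol. 2, p. 190, eq. (23)].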
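The recurrence (C-6) together with (C-5) translates directly into a short routine for the even moments. The sketch below is one possible arrangement; the function names `laguerre` and `even_moment` are illustrative, not taken from the text.

```python
import math


def laguerre(n, y):
    """Laguerre polynomial L_n(y) from the three-term recurrence (C-6)."""
    prev, cur = 1.0, 1.0 - y          # L_0(y), L_1(y)
    if n == 0:
        return prev
    for k in range(1, n):
        # (k + 1) L_{k+1} = (2k + 1 - y) L_k - k L_{k-1}
        prev, cur = cur, ((2 * k + 1 - y) * cur - k * prev) / (k + 1)
    return cur


def even_moment(n, alpha):
    """E(x^{2n}) = 2^n n! L_n(-alpha^2 / 2), eq. (C-5)."""
    return 2 ** n * math.factorial(n) * laguerre(n, -0.5 * alpha * alpha)


alpha = 1.3
second = even_moment(1, alpha)   # should equal 2 + alpha^2
fourth = even_moment(2, alpha)   # should equal 8 + 8 alpha^2 + alpha^4
```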

The moments of all orders are most conveniently written in terms of the confluent
hypergeometric function,

$$E(x^m) = 2^{m/2}\, \Gamma\left(\tfrac{1}{2}m + 1\right) {}_1F_1\left(-\tfrac{1}{2}m;\ 1;\ -\tfrac{1}{2}\alpha^2\right), \tag{C-7}$$

which reduces to (C-5) when $m$ equals $2n$ [Ric44, part 2, eq. (3.10-12), p. 107]. In
particular,



and 



^■(?r^(-kK?) + H?)] 



^ 3 )=(f) ,/2 -K^-^3> (^ + 4 + > a2)/1 ^j 

which can be obtained by expressing the Bessel functions as confluent hypergeomet- 
ric functions and applying the recurrent relations for the hypergeometric functions 
[Ric44, part 2, eq. (4.2-3), p. 119], [Mid60a, p. 1076]. It is shown in [Hel90] that the 
quantities a m = E(x m )/m\ obey the recurrent relation 



"OT+z " ■ 



(m 2 ~ \){m + 2) ' (C_8 > 

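The partial derivatives of the Q function are

$$\frac{\partial Q(\alpha, \beta)}{\partial \alpha} = \beta\, e^{-(\alpha^2 + \beta^2)/2} I_1(\alpha\beta), \tag{C-9}$$

which is obtained by differentiating the defining equation (C-2) and integrating by
parts, and

$$\frac{\partial Q(\alpha, \beta)}{\partial \beta} = -\beta\, e^{-(\alpha^2 + \beta^2)/2} I_0(\alpha\beta). \tag{C-10}$$

The formula

$$Q(\alpha, \beta) + Q(\beta, \alpha) = 1 + e^{-(\alpha^2 + \beta^2)/2} I_0(\alpha\beta) \tag{C-11}$$

can be proved by noting that the first partial derivatives of both sides with respect
to $\alpha$ are equal, by (C-9) and (C-10). Because the formula holds at $\alpha = 0$ by (C-3),
it must hold for all values of $\alpha$.

The equation

$$Q(\alpha, \beta) = e^{-(\alpha^2 + \beta^2)/2} \sum_{n=0}^{\infty} \left(\frac{\alpha}{\beta}\right)^n I_n(\alpha\beta) \tag{C-12}$$

holds at $\alpha = 0$ by virtue of (C-3), and its partial derivative with respect to $\alpha$ agrees
with (C-9), as can be shown by using

$$\frac{\partial}{\partial \alpha} \left[\alpha^n I_n(\alpha\beta)\right] = \beta \alpha^n I_{n-1}(\alpha\beta).$$

Hence it is valid for all values of $\alpha$. Combining (C-12) and (C-11) and interchanging
$\alpha$ and $\beta$, we get

$$Q(\alpha, \beta) = 1 - e^{-(\alpha^2 + \beta^2)/2} \sum_{n=1}^{\infty} \left(\frac{\beta}{\alpha}\right)^n I_n(\alpha\beta). \tag{C-13}$$

This formula was attributed by Rice [Ric44] to W. R. Bennett.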
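The two series (C-12) and (C-13) give a simple way to evaluate $Q(\alpha, \beta)$ for moderate arguments: use whichever has a ratio less than 1, and fall back on (C-11) when $\alpha = \beta$. The sketch below assumes a plain power-series evaluation of $I_n(x)$ and a fixed truncation `n_terms`; both are choices made here for illustration and are adequate only when $\alpha\beta$ is not large.

```python
import math


def bessel_i(n, x, max_terms=500):
    """Modified Bessel function I_n(x), n >= 0, x >= 0, from its power series."""
    term = 1.0
    for j in range(1, n + 1):              # leading term (x/2)^n / n!
        term *= 0.5 * x / j
    total = term
    for k in range(1, max_terms):
        term *= 0.25 * x * x / (k * (k + n))
        total += term
        if term <= 1e-17 * total:
            break
    return total


def marcum_q(alpha, beta, n_terms=80):
    """Marcum's Q(alpha, beta) from the series (C-12) or (C-13)."""
    w = math.exp(-0.5 * (alpha * alpha + beta * beta))
    x = alpha * beta
    if alpha < beta:                       # series (C-12), ratio alpha/beta < 1
        r = alpha / beta
        return w * sum(r ** n * bessel_i(n, x) for n in range(n_terms))
    if alpha > beta:                       # series (C-13), ratio beta/alpha < 1
        r = beta / alpha
        return 1.0 - w * sum(r ** n * bessel_i(n, x) for n in range(1, n_terms))
    return 0.5 * (1.0 + w * bessel_i(0, x))   # alpha == beta, from (C-11)
```

For large $\alpha\beta$ the factor $e^{-(\alpha^2+\beta^2)/2}$ underflows while the Bessel functions overflow, so a scaled Bessel routine or one of the other methods in this appendix would be needed.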

The following asymptotic formulas are given in [Ric44, pp. 108-9]: 



ap » 1, a » p - a » 1. 

For ap » 1, a - p » 1, 

0(a B) « 1 - — f^f [ 1 " + ...1 

(J(a, P) ~ 1 a _ p ^ 8ap(p _ a)2 J 

This can be reduced to 

1/2 

by using the asymptotic form of the error-function integral, 



erfcx » -4= e"** 2 f 1 " \ + A " "1 x » 



(C-15) 



[Abr70, p. 932, eq. 26.2.12]. By using (C-ll) with the asymptotic form of the Bessel 
function ^(ap), 
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(C-17) 



[Abr70, p. 977, eq. 9.7.1], and with (C-15), we obtain 

1/2 



«P » I, 3 - a » 1. 
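C.2 Mth-order Q Function

The $M$th-order or generalized Q function is defined by

$$Q_M(\alpha, \beta) = \int_\beta^\infty x \left(\frac{x}{\alpha}\right)^{M-1} e^{-(x^2 + \alpha^2)/2} I_{M-1}(\alpha x)\, dx. \tag{C-18}$$

For computational purposes we shall also consider the equivalent form

$$\mathcal{Q}_M(S, y) = Q_M\left(\sqrt{2S}, \sqrt{2y}\right) = \int_y^\infty \left(\frac{w}{S}\right)^{(M-1)/2} e^{-S-w} I_{M-1}\left(2\sqrt{Sw}\right) dw, \tag{C-19}$$

which represents the probability that a random variable having the probability density
function

$$p(w) = \left(\frac{w}{S}\right)^{(M-1)/2} e^{-S-w} I_{M-1}\left(2\sqrt{Sw}\right) U(w) \tag{C-20}$$

exceeds $y$; $w = \tfrac{1}{2}x^2$, $y = \tfrac{1}{2}\beta^2$, and $S = \tfrac{1}{2}\alpha^2$.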

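As in (4-24) the moment-generating function of this random variable is

$$h(z) = E(e^{-zw}) = (1 + z)^{-M} \exp\left(-\frac{Sz}{1 + z}\right). \tag{C-21}$$

By using the generating function for the associated Laguerre polynomials as given
in [Erd53, vol. 2, eq. 10.12(17), p. 189], we can write $h(z)$ as

$$h(z) = \sum_{n=0}^{\infty} E(w^n) \frac{(-z)^n}{n!} = \sum_{n=0}^{\infty} L_n^{(M-1)}(-S)\, (-z)^n,$$

whence the moments of $w = \tfrac{1}{2}x^2$ are

$$E(w^n) = n!\, L_n^{(M-1)}(-S),$$

which reduces to (C-5) when $M = 1$. The associated Laguerre polynomials are
defined by

$$L_n^{(a)}(y) = \sum_{m=0}^{n} \binom{n + a}{n - m} \frac{(-y)^m}{m!},$$

and they can be calculated by the recurrent relation

$$(n + 1) L_{n+1}^{(a)}(y) = (2n + a + 1 - y) L_n^{(a)}(y) - (n + a) L_{n-1}^{(a)}(y), \qquad L_0^{(a)}(y) = 1, \quad L_1^{(a)}(y) = a + 1 - y$$

[Erd53, vol. 2, p. 190, eq. (23)].

Corresponding to (C-13) the generalized Q function has an expansion

$$Q_M(\alpha, \beta) = 1 - e^{-(\alpha^2 + \beta^2)/2} \sum_{n=M}^{\infty} \left(\frac{\beta}{\alpha}\right)^n I_n(\alpha\beta), \tag{C-23}$$

as can be shown by differentiating both sides with respect to $\beta$ and using

$$\frac{\partial}{\partial \beta} \left[\beta^n I_n(\alpha\beta)\right] = \alpha \beta^n I_{n-1}(\alpha\beta).$$

Hence by (C-13) it is related to the first-order function by

$$Q_M(\alpha, \beta) = Q(\alpha, \beta) + e^{-(\alpha^2 + \beta^2)/2} \sum_{n=1}^{M-1} \left(\frac{\beta}{\alpha}\right)^n I_n(\alpha\beta).$$

Because $I_{-n}(\alpha\beta) = I_n(\alpha\beta)$, we can write this as

$$Q_M(\alpha, \beta) = Q(\alpha, \beta) + e^{-(\alpha^2 + \beta^2)/2} \sum_{n=1-M}^{-1} \left(\frac{\alpha}{\beta}\right)^n I_n(\alpha\beta),$$

and by using (C-12) for $Q(\alpha, \beta)$, we find

$$Q_M(\alpha, \beta) = e^{-(\alpha^2 + \beta^2)/2} \sum_{n=1-M}^{\infty} \left(\frac{\alpha}{\beta}\right)^n I_n(\alpha\beta). \tag{C-24}$$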

A trigonometrical integral for the generalized Q function can be found by
substituting the integral

$$I_k(\alpha\beta) = \int_0^{2\pi} \cos k\theta\; e^{\alpha\beta\cos\theta}\, \frac{d\theta}{2\pi} \tag{C-25}$$

into (C-24) and interchanging summation and integration:

$$Q_M(\alpha, \beta) = e^{-(\alpha^2 + \beta^2)/2} \int_0^{2\pi} \sum_{k=1-M}^{\infty} \left(\frac{\alpha}{\beta}\right)^k \cos k\theta\; e^{\alpha\beta\cos\theta}\, \frac{d\theta}{2\pi}.$$

Writing $\cos k\theta = \tfrac{1}{2}(e^{ik\theta} + e^{-ik\theta})$, summing the resulting geometric progressions, and
combining, we find

$$Q_M(\alpha, \beta) = \beta \left(\frac{\beta}{\alpha}\right)^{M-1} e^{-(\beta - \alpha)^2/2} \int_0^{2\pi} e^{-\alpha\beta(1 - \cos\theta)}\, \frac{\beta\cos(M-1)\theta - \alpha\cos M\theta}{\alpha^2 + \beta^2 - 2\alpha\beta\cos\theta}\, \frac{d\theta}{2\pi}, \qquad \beta > \alpha. \tag{C-26}$$

For $\alpha > \beta$, on the other hand, we make the same substitution into (C-23), and the
same procedure leads to the integral

$$1 - Q_M(\alpha, \beta) = \beta \left(\frac{\beta}{\alpha}\right)^{M-1} e^{-(\beta - \alpha)^2/2} \int_0^{2\pi} e^{-\alpha\beta(1 - \cos\theta)}\, \frac{\alpha\cos M\theta - \beta\cos(M-1)\theta}{\alpha^2 + \beta^2 - 2\alpha\beta\cos\theta}\, \frac{d\theta}{2\pi}, \qquad \beta < \alpha. \tag{C-27}$$

If one has a programmable calculator or computer software with a built-in
numerical integration routine, the most easily programmed technique to compute
the function $Q_M(\alpha, \beta)$ is to integrate the right side of (C-26) numerically. When
$\alpha > \beta$, the result is negative and equal to $-[1 - Q_M(\alpha, \beta)]$. It suffices to take the
integral from $0$ to $\pi$ and multiply by 2. Because the integrand resembles a
semi-Gaussian of width on the order of $(\alpha\beta)^{-1/2}$ peaked at $\theta = 0$, it is advisable to break
the integral into two parts, one running from $\theta = 0$ to $\theta = \theta_1$, the other from $\theta = \theta_1$
to $\pi$, where $\theta_1$ is on the order of $(10/\alpha\beta)^{1/2}$. Such a program may take longer to
execute, however, than the methods to be described in Sec. C.3.
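The numerical integration of (C-26) described above can be sketched as follows. The routine uses a composite Simpson rule over $(0, \pi)$ without the suggested split at $\theta_1$, which is an assumption that keeps the example short and is adequate only when $\alpha\beta$ is moderate and $\alpha$ is not close to $\beta$; the name `q_m_trig` and the default number of panels are choices made here.

```python
import math


def q_m_trig(M, alpha, beta, n=2000):
    """Generalized Q function Q_M(alpha, beta) from the integral (C-26).

    Requires alpha > 0 and alpha != beta; n must be even (Simpson's rule).
    """
    if alpha <= 0 or alpha == beta:
        raise ValueError("need alpha > 0 and alpha != beta")
    pref = (beta * (beta / alpha) ** (M - 1)
            * math.exp(-0.5 * (beta - alpha) ** 2))

    def f(theta):
        c = math.cos(theta)
        num = beta * math.cos((M - 1) * theta) - alpha * math.cos(M * theta)
        den = alpha * alpha + beta * beta - 2.0 * alpha * beta * c
        return math.exp(-alpha * beta * (1.0 - c)) * num / den

    # The integrand is even about theta = pi, so the integral over (0, 2 pi)
    # with measure d(theta)/(2 pi) equals (1/pi) times the integral over (0, pi).
    h = math.pi / n
    s = f(0.0) + f(math.pi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(k * h)
    value = pref * (s * h / 3.0) / math.pi

    # For alpha > beta the right side of (C-26) equals -[1 - Q_M].
    return value if beta > alpha else 1.0 + value
```

Writing the exponential as $e^{-(\beta-\alpha)^2/2}\, e^{-\alpha\beta(1-\cos\theta)}$ keeps every factor at most 1 in magnitude, which avoids overflow for large arguments.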

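When $\alpha = \beta$, we must proceed as follows. Combining (C-23) and (C-24) yields

$$Q_M(\alpha, \beta) + Q_M(\beta, \alpha) = 1 + e^{-(\alpha^2 + \beta^2)/2} \sum_{n=1-M}^{M-1} \left(\frac{\alpha}{\beta}\right)^n I_n(\alpha\beta),$$

so that

$$Q_M(\alpha, \alpha) = \tfrac{1}{2} + \tfrac{1}{2} e^{-\alpha^2} \sum_{k=1-M}^{M-1} I_k(\alpha^2) = \tfrac{1}{2} + e^{-\alpha^2} \left[\tfrac{1}{2} I_0(\alpha^2) + \sum_{k=1}^{M-1} I_k(\alpha^2)\right], \tag{C-28}$$

which can be evaluated by using tabulated values of the modified Bessel functions
and their recurrent relation

$$I_{n+1}(x) = I_{n-1}(x) - \frac{2n}{x} I_n(x).$$

Lacking a table, one can numerically integrate

$$Q_M(\alpha, \alpha) = \tfrac{1}{2} + \tfrac{1}{2} \int_0^{2\pi} e^{-\alpha^2(1 - \cos\theta)}\, \frac{\sin\left(M - \tfrac{1}{2}\right)\theta}{\sin\tfrac{1}{2}\theta}\, \frac{d\theta}{2\pi},$$

which is derived by putting (C-25) into (C-28).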

C.3 Computation by Recurrence 

We evaluate the generalized Q function in the notationally more convenient form
(C-19), and we utilize the moment-generating function in (C-21). Applying the power
series for $e^x$, we write this as

$$h(z) = e^{-S} \sum_{r=0}^{\infty} \frac{S^r}{r!} (1 + z)^{-M-r},$$

whose inverse Laplace transform immediately yields for the probability density
function

$$p(w) = e^{-S} \sum_{r=0}^{\infty} \frac{S^r}{r!}\, \frac{w^{M+r-1} e^{-w}}{(M + r - 1)!}.$$

This also follows from the power-series expansion of the Bessel function in (C-20).
Integrating it term by term, we obtain for (C-19)

$$\mathcal{Q}_M(S, y) = \sum_{r=0}^{\infty} b_r\, g_{M+r-1}, \tag{C-29}$$

where

$$b_r = \frac{S^r e^{-S}}{r!}, \qquad g_k = \frac{1}{k!} \int_y^\infty w^k e^{-w}\, dw. \tag{C-30}$$

Integrating the latter by parts, we find the recurrence

$$g_k = \delta_k + g_{k-1}, \qquad \delta_k = \frac{y^k e^{-y}}{k!} = \frac{y}{k}\, \delta_{k-1}, \qquad \delta_0 = g_0 = e^{-y}, \tag{C-31}$$

which with the recurrence

$$b_r = \frac{S}{r}\, b_{r-1}, \qquad b_0 = e^{-S}, \tag{C-32}$$

defines an easily programmed algorithm for computing the $M$th-order Q function
through (C-29). It is most suitable when the limit $y$ of integration lies above the
expected value $E(w) = M + S$ of the density function in (C-20).
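The recurrences (C-31) and (C-32) lead to a loop of only a few lines. The sketch below is one way to arrange it; the stopping rule (stop once $r > S$ and the latest term is below a relative tolerance) is a simple heuristic assumed here and is not the error bound of [Hel92d].

```python
import math


def q_m_recurrence(M, S, y, rel_tol=1e-12, max_terms=10000):
    """Q_M(S, y) = sum_r b_r g_{M+r-1}, eqs. (C-29) to (C-32)."""
    # Build g_{M-1} from g_k = delta_k + g_{k-1}, delta_k = (y/k) delta_{k-1}.
    delta = math.exp(-y)              # delta_0
    g = delta                         # g_0
    for k in range(1, M):
        delta *= y / k
        g += delta
    b = math.exp(-S)                  # b_0
    total = b * g
    k = M - 1
    for r in range(1, max_terms):
        b *= S / r                    # b_r = (S/r) b_{r-1}
        k += 1
        delta *= y / k                # delta_{M+r-1}
        g += delta                    # g_{M+r-1}
        term = b * g
        total += term
        # Heuristic stop (an assumption, not the bound from [Hel92d]).
        if r > S and term <= rel_tol * total:
            break
    return total
```

With $S = 0$ only the first term survives and the routine returns $g_{M-1}$, the false-alarm probability for decision level $y$.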

When $y$ is a decision level chosen for a certain false-alarm probability $Q_0$,

$$Q_0 = \sum_{r=0}^{M-1} \frac{y^r e^{-y}}{r!},$$

whereupon (C-29) becomes

$$\mathcal{Q}_M(S, y) = Q_0\, e^{-S} + \sum_{r=1}^{\infty} b_r\, g_{M+r-1}, \qquad y > M + S,$$

which is sometimes a useful form. This algorithm was given by Dillard [Dil73] as
an extension of McGee's [McG70] for the ordinary Q function $Q(\alpha, \beta)$.

It is shown in [Hel92d] that the error $E_k$ incurred by terminating this series
with the $k$th term is bounded by

E k =s VM/^, 

provided $k > S$ and $k \gg 1$. One can stop the summation in (C-29) when the
quantity on the right side, divided by the accumulated sum, falls below a preassigned
relative error.

When $y < M + S$, it is preferable for the sake of accuracy to compute
$1 - \mathcal{Q}_M(S, y)$. From (C-31)

$$g_k = \sum_{m=0}^{k} \delta_m,$$

so that (C-29) can be written

$$\mathcal{Q}_M(S, y) = \sum_{r=0}^{\infty} b_r \sum_{m=0}^{M+r-1} \delta_m,$$

and, because $\sum_r b_r = \sum_m \delta_m = 1$,

$$1 - \mathcal{Q}_M(S, y) = \sum_{r=0}^{\infty} b_r \sum_{m=M+r}^{\infty} \delta_m.$$

Changing the order of summation yields

$$1 - \mathcal{Q}_M(S, y) = \sum_{m=0}^{\infty} \delta_{m+M} \sum_{r=0}^{m} b_r = \sum_{m=0}^{\infty} h_m\, \delta_{m+M}, \qquad y < M + S, \tag{C-33}$$

where the $h_m$'s are computed by the recursion

$$h_m = h_{m-1} + b_m, \qquad h_0 = b_0 = e^{-S}, \tag{C-34}$$

the $b_m$'s being calculated as in (C-32). The $\delta_{m+M}$'s are calculated as in (C-31).
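The complementary series (C-33) with the recursion (C-34) can be programmed in the same style. As a sketch under stated assumptions, the loop below stops once $m + M > y$ and the latest term falls below a relative tolerance; that stopping rule is a heuristic chosen here, not the bound from [Hel92d].

```python
import math


def q_m_complement(M, S, y, rel_tol=1e-12, max_terms=10000):
    """1 - Q_M(S, y) = sum_m h_m delta_{m+M}, eqs. (C-33) and (C-34)."""
    delta = math.exp(-y)              # delta_0
    for k in range(1, M + 1):
        delta *= y / k                # ends as delta_M
    b = math.exp(-S)                  # b_0
    h = b                             # h_0
    total = h * delta
    for m in range(1, max_terms):
        b *= S / m                    # b_m = (S/m) b_{m-1}
        h += b                        # h_m = h_{m-1} + b_m
        delta *= y / (m + M)          # delta_{m+M}
        term = h * delta
        total += term
        # Heuristic stop (an assumption, not the bound from [Hel92d]).
        if m + M > y and term <= rel_tol * total:
            break
    return total
```

Computing the small quantity $1 - \mathcal{Q}_M$ directly in this way avoids the loss of significant figures that would result from subtracting a number close to 1 from 1.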

It is shown in [Hel92d] that the error E k incurred by terminating this series 
with the kth term is bounded by 

provided that M + k > y and k » 1. As described above, this bound can be used 
to tell the computer when to stop summing. 

Both these recurrent algorithms permit easy generalization to situations in
which either the signal strength $S$ or the decision level $y$ or both are random variables;
see Appendix E. Related algorithms were given by Brennan and Reed [Bre65],
Robertson [Rob69], and Shnidman [Shn89]. When $M$ is large, all these methods
suffer from computer underflow or overflow: for ordinary values of the false-alarm
and detection probabilities, $S$ and $y$ are then large and $e^{-S}$ and $e^{-y}$ extremely small.
Complicated stratagems are necessary to cope with that problem, and the
contour-integration methods described in Sec. 5.2 are preferable. Quite a different method
has been described by Parl [Par80], but it too is inefficient for large values of $M$.
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Appendix D 



Error Probability for a 
Channel Carrying Two 
Nonorthogonal Signals 
with Random Phases 



According to Sec. 3.5.2, we want the probability that Rq > R\, where 

RiA\ T Q?{t)V{t)dt, z=0,I, 
\fo 

under hypothesis H\ that V(t) = F } (t) + N(t). Here N{t) is the complex envelope 
of Gaussian random noise with autocovariance function 

<Mt) = Re 4>{r)e ia \ 

and Qo(t) and Q\{t) are solutions of the integral equations 

Fi(t) = ("$(*- u)Qi(u) du, < t < T, i - 0, 1. 
Jo 

The signals are so chosen that 

\ T Qi(W(t)<& = \ T QS(tyFv{t)dt = d 2 . 
Jo Jo 

For detection in white noise this means that they are received with equal energies $E$,
and $d^2 = 2E/N$. In this analysis we follow [Hel55].

We introduce the circular complex Gaussian random variables 

zj = Xj + iyj = \ T Q*(t)nt)dt, j = a, 1. 
Jo 
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The joint distribution of their real and imaginary parts x > >'o, and y] is conve- 
niently written in terms of z and z\ in the manner introduced in Sec. 3.2.3. Their 
expected values are 



(D-l) 



E(zo\ Hi) = £ = e'* C QS(t)Fi(t)dt = kd 2 
Jo 

E{z x \H x ) = £, = f QfiWit) dt = rf 2 
Jo 

where the complex number X is defined by 

\ = d~ 2 \ QS(t)Fi(t) dt =d- 2 \ QoUMt - u)Q } (u) dt du. 
Jo Jo Jo 

The variances and covariances of x , y , x u and y\ can be obtained from the 
relations 



Uo, 2 *} = {z h z,*} = \E f f QZ{h)Qo{t2)N{t x )N*{t 2 ) dh dt 2 
Jo Jo 

= J o J o fi£ (*i)Go(fe)$(fi - 1 2 ) dti dt 2 (D _ 2 ) 

= \ T QS(tmt)dt =d\ 
Jo 

feo,**} = J J o flo(/i)fii(?2)iai - ^2 = Xrf 2 . (D-3) 
Here we use the notation 

{w u wj} = iEliwi - E(w0][w 2 - E(w*)]} 

for the complex covariances of two circular complex variables w\ and w 2 . The 
complex covariance matrix of the complex variables z 0> Z], as defined in (3-36), is 
therefore 

. _T d 2 Xd 2 ! 
* - [ X'rf* </ 2 J' 

its determinant is 

det<f> = d\\ - |\| 2 ), 

and its inverse is 

*" = * = imhm I -1* t]- 

Thus the joint probability density function of $x_0$, $y_0$, $x_1$, and $y_1$ can be written in
circular Gaussian form as in (3-41):

$$p(x_0, y_0, x_1, y_1) = p(z_0, z_1) = \frac{1}{(2\pi)^2 d^4 (1 - |\lambda|^2)} \exp\left[ -\frac{|z_0 - \zeta_0|^2 + |z_1 - \zeta_1|^2 - 2 \operatorname{Re} \lambda (z_0 - \zeta_0)^* (z_1 - \zeta_1)}{2 d^2 (1 - |\lambda|^2)} \right].$$

When we substitute from (D-1), we find that a number of terms cancel, and we
obtain

$$p(z_0, z_1) = \frac{1}{(2\pi)^2 d^4 (1 - |\lambda|^2)} \exp\left[ -\frac{|z_0|^2 + |z_1|^2 + (1 - |\lambda|^2) d^4 - 2 d^2 (1 - |\lambda|^2) \operatorname{Re}(z_1 e^{-i\psi}) - 2 \operatorname{Re}(\lambda z_0^* z_1)}{2 d^2 (1 - |\lambda|^2)} \right].$$

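Introducing polar coordinates, we write

$$z_0 = R_0 e^{i\phi_0}, \qquad z_1 = R_1 e^{i\phi_1}, \qquad \lambda = |\lambda| e^{i\beta}.$$

The joint probability density function of the random variables $R_0$, $R_1$, $\phi_0$, and $\phi_1$ is
then

$$p(R_0, \phi_0, R_1, \phi_1) = \frac{R_0 R_1}{(2\pi)^2 d^4 (1 - |\lambda|^2)}\, e^{-d^2/2} \exp\left[ -\frac{R_0^2 + R_1^2 - 2 d^2 (1 - |\lambda|^2) R_1 \cos(\phi_1 - \psi) - 2 |\lambda| R_0 R_1 \cos(\phi_1 - \phi_0 + \beta)}{2 d^2 (1 - |\lambda|^2)} \right]$$

after we multiply by the Jacobian factor $R_0 R_1$. We first integrate over $0 < \phi_0 < 2\pi$
and then over $0 < \phi_1 < 2\pi$, and by using (3-61) we obtain the joint density function
of $R_0$ and $R_1$ under hypothesis $H_1$:

$$p(R_0, R_1) = \frac{R_0 R_1}{d^4 (1 - |\lambda|^2)} \exp\left[ -\frac{R_0^2 + R_1^2}{2 d^2 (1 - |\lambda|^2)} - \frac{d^2}{2} \right] I_0(R_1)\, I_0\!\left( \frac{|\lambda| R_0 R_1}{d^2 (1 - |\lambda|^2)} \right).$$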

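The probability of error is

$$P_e = \int_0^\infty dR_1 \int_{R_1}^\infty dR_0\; p(R_0, R_1).$$

Introducing the new integration variables

$$x = \frac{R_1}{d \sqrt{1 - |\lambda|^2}}, \qquad y = \frac{R_0}{d \sqrt{1 - |\lambda|^2}},$$

we write the error probability as the double integral

$$P_e = (1 - |\lambda|^2) e^{-d^2/2} \int_0^\infty x\, I_0\!\left( d \sqrt{1 - |\lambda|^2}\, x \right) e^{-x^2/2}\,dx \int_x^\infty y\, e^{-y^2/2} I_0(|\lambda| x y)\,dy$$

$$= (1 - |\lambda|^2) e^{-d^2/2} \int_0^\infty x\, e^{-\frac12 (1 - |\lambda|^2) x^2} I_0\!\left( d \sqrt{1 - |\lambda|^2}\, x \right) dx \int_x^\infty y\, e^{-\frac12 (y^2 + |\lambda|^2 x^2)} I_0(|\lambda| x y)\,dy$$

$$= (1 - |\lambda|^2) e^{-d^2/2} \int_0^\infty x\, e^{-\frac12 (1 - |\lambda|^2) x^2} I_0\!\left( d \sqrt{1 - |\lambda|^2}\, x \right) Q(|\lambda| x, x)\,dx \tag{D-4}$$

in terms of the Q function as defined in (3-76).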
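Let us now write (C-26) for $M = 1$ as

$$Q(\alpha, \beta) = e^{-\frac12 (\alpha^2 + \beta^2)} \int_0^{2\pi} \frac{1 - (\alpha/\beta) \cos\theta}{1 + (\alpha/\beta)^2 - 2 (\alpha/\beta) \cos\theta}\, e^{\alpha\beta \cos\theta}\, \frac{d\theta}{2\pi}, \qquad \alpha < \beta, \tag{D-5}$$

which yields

$$Q(|\lambda| x, x) = e^{-\frac12 (1 + |\lambda|^2) x^2} \int_0^{2\pi} \frac{1 - |\lambda| \cos\theta}{1 + |\lambda|^2 - 2 |\lambda| \cos\theta}\, e^{|\lambda| x^2 \cos\theta}\, \frac{d\theta}{2\pi}.$$

Putting this into (D-4), we integrate over $0 < x < \infty$ by means of (C-4),

$$\int_0^\infty x\, e^{-x^2 (1 - |\lambda| \cos\theta)} I_0\!\left( d \sqrt{1 - |\lambda|^2}\, x \right) dx = \frac{1}{2 (1 - |\lambda| \cos\theta)} \exp\left[ \frac{d^2 (1 - |\lambda|^2)}{4 (1 - |\lambda| \cos\theta)} \right],$$

and we obtain for (D-4) the trigonometric integral

$$P_e = \tfrac12 (1 - |\lambda|^2) e^{-d^2/2} \int_0^{2\pi} \frac{1}{1 - 2 |\lambda| \cos\theta + |\lambda|^2} \exp\left[ \frac{d^2 (1 - |\lambda|^2)}{4 (1 - |\lambda| \cos\theta)} \right] \frac{d\theta}{2\pi}. \tag{D-6}$$

For simplicity, we henceforward drop the absolute-value signs on $|\lambda|$.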

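This integral can be simplified by changing the integration variable by

$$\cos\theta = \frac{\lambda + \cos\phi}{1 + \lambda \cos\phi},$$

for which a certain amount of algebra produces the relations

$$1 - \lambda \cos\theta = \frac{1 - \lambda^2}{1 + \lambda \cos\phi}, \qquad d\theta = \frac{\sqrt{1 - \lambda^2}}{1 + \lambda \cos\phi}\, d\phi,$$

$$1 + \lambda^2 - 2 \lambda \cos\theta = \frac{(1 - \lambda^2)(1 - \lambda \cos\phi)}{1 + \lambda \cos\phi},$$

and these enable us to write (D-6) as

$$P_e = \tfrac12 \sqrt{1 - \lambda^2}\, e^{-d^2/4} \int_0^{2\pi} \frac{\exp\left( \tfrac14 \lambda d^2 \cos\phi \right)}{1 - \lambda \cos\phi}\, \frac{d\phi}{2\pi}. \tag{D-7}$$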

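If we now write (C-27) for $M = 1$, $\alpha > \beta$, as

$$1 - Q(\alpha, \beta) = \beta\, e^{-\frac12 (\alpha^2 + \beta^2)} \int_0^{2\pi} e^{\alpha\beta \cos\phi}\, \frac{\alpha \cos\phi - \beta}{\alpha^2 + \beta^2 - 2 \alpha\beta \cos\phi}\, \frac{d\phi}{2\pi}$$

and add to it (D-5) after interchanging $\alpha$ and $\beta$, we find

$$1 - Q(\alpha, \beta) + Q(\beta, \alpha) = e^{-\frac12 (\alpha^2 + \beta^2)} \int_0^{2\pi} \frac{(\alpha^2 - \beta^2)\, e^{\alpha\beta \cos\phi}}{\alpha^2 + \beta^2 - 2 \alpha\beta \cos\phi}\, \frac{d\phi}{2\pi}.$$

Comparing this with (D-7), we see that the integrals have the same form after we
put

$$\alpha^2 + \beta^2 = \tfrac12 d^2, \qquad \alpha\beta = \tfrac14 \lambda d^2,$$

that is, if we take $\alpha$ and $\beta$ as

$$\alpha = \tfrac12 d \left[ 1 + \sqrt{1 - \lambda^2} \right]^{1/2}, \qquad \beta = \tfrac12 d \left[ 1 - \sqrt{1 - \lambda^2} \right]^{1/2}.$$

The probability of error thus becomes

$$P_e = \tfrac12 \left[ 1 - Q(\alpha, \beta) + Q(\beta, \alpha) \right] \tag{D-8}$$

as in (3-82). The second form given there results from using (C-11) for the first two
terms in the brackets. A number of formulas of this type have been given by Price
[Pri64].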
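As a numerical illustration of (D-8), the following Python sketch evaluates the error probability for a given $d$ and $|\lambda|$. The function names `marcum_q` and `error_probability` are hypothetical, and the Poisson-weighted series used for the Q function is a standard representation assumed here for convenience; it is not claimed to be the recursion of Appendix C.

```python
import math

def marcum_q(a, b, tol=1e-16, max_terms=100000):
    """First-order Q function Q(a, b), summed as a Poisson mixture (assumed method).

    Q(a, b) = sum_k Pois(k; a^2/2) * Pr[Pois(b^2/2) <= k].
    Suitable for moderate arguments only: exp(-a^2/2) must not underflow.
    """
    x = 0.5 * a * a
    y = 0.5 * b * b
    weight = math.exp(-x)      # Poisson(x) probability of k = 0
    term = math.exp(-y)        # Poisson(y) probability of j = 0
    cdf = term                 # Pr[Pois(y) <= k], updated with k
    total = weight * cdf
    for k in range(1, max_terms):
        weight *= x / k
        term *= y / k
        cdf += term
        total += weight * cdf
        if k > x and weight < tol:   # past the Poisson peak and negligible
            break
    return total

def error_probability(d, lam):
    """P_e of (D-8) with alpha, beta taken from the expressions preceding it."""
    root = math.sqrt(1.0 - lam * lam)
    alpha = 0.5 * d * math.sqrt(1.0 + root)
    beta = 0.5 * d * math.sqrt(1.0 - root)
    return 0.5 * (1.0 - marcum_q(alpha, beta) + marcum_q(beta, alpha))
```

For orthogonal signals, $\lambda = 0$, the sketch reduces to $\tfrac12 e^{-d^2/4}$, and for $|\lambda| = 1$ it returns $\tfrac12$, which are the two limits one expects from (D-8).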

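The probability of error is equal to the probability that the quadratic form

$$U = \tfrac12 \left( |z_1|^2 - |z_0|^2 \right)$$

is less than zero. This is a special case of the general Hermitian quadratic form

$$U = \tfrac12 \mathbf{V}^+ \mathbf{H} \mathbf{V}$$

in $n$ circular complex Gaussian random variables $(V_1, V_2, \ldots, V_n)$, which we write
as a column vector $\mathbf{V}$;

$$\mathbf{V}^+ = (V_1^*, V_2^*, \ldots, V_n^*)$$

is the conjugate transposed row vector, and $\mathbf{H}$ is an $n \times n$ Hermitian matrix. We
designate the column vector of the expected values of the random variables by $\mathbf{S}$,
$S_m = E(V_m)$. The $n \times n$ complex covariance matrix of these variables is denoted by

$$\boldsymbol{\phi} = \tfrac12 E[(\mathbf{V} - \mathbf{S})(\mathbf{V}^+ - \mathbf{S}^+)].$$

The moment-generating function of the random variable $U$ then takes the form

$$h(z) = E(e^{-Uz}) = [\det(\mathbf{I} + z \mathbf{H} \boldsymbol{\phi})]^{-1} \exp\left[ -\tfrac12 z\, \mathbf{S}^+ (\mathbf{I} + z \mathbf{H} \boldsymbol{\phi})^{-1} \mathbf{H} \mathbf{S} \right], \tag{D-9}$$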

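as can be shown by means of (B-10); $\mathbf{I}$ is the $n \times n$ identity matrix. If furthermore
we are given a random variable $U$ that is the sum of, say, $M$ independent random
variables $U_k$ of this type,

$$U = \sum_{k=1}^{M} U_k, \qquad U_k = \tfrac12 \mathbf{V}_k^+ \mathbf{H} \mathbf{V}_k,$$

and if the components of each $\mathbf{V}_k$ have the same covariance matrix $\boldsymbol{\phi}$, but possibly
different expected values $\mathbf{S}_k$, it is only necessary to multiply their moment-generating
functions to obtain that of $U$:

$$h(z) = [\det(\mathbf{I} + z \mathbf{H} \boldsymbol{\phi})]^{-M} \exp\left[ -\tfrac12 z \sum_{k=1}^{M} \mathbf{S}_k^+ (\mathbf{I} + z \mathbf{H} \boldsymbol{\phi})^{-1} \mathbf{H} \mathbf{S}_k \right]. \tag{D-10}$$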

For example, Proakis [Pro89, pp. 344-9] utilized a form of this type in order
to derive an expression for the probability that $U > 0$ when the matrix $\mathbf{H}$ is a $2 \times 2$
nonpositive-definite matrix

$$\mathbf{H} = \begin{bmatrix} A & C \\ C^* & B \end{bmatrix}. \tag{D-11}$$

His expression involves the Q function and a series of modified Bessel functions and
is a generalization of (D-8). For the latter, $C = 0$, $A = 1$, and $B = -1$.

As shown first by Rice [Ric80], the probability density function of a quadratic
form such as $U$ can be computed from the moment-generating function (D-9) or (D-10)
by numerical integration of the Laplace inversion integral (5-4), and the numerical
methods described in Sec. 5.2 can be applied to evaluating the cumulative distribution
of $U$ and its complement. In the case of (D-11), for instance, the matrix $\mathbf{I} + z \mathbf{H} \boldsymbol{\phi}$
involved in (D-9) and (D-10) is a $2 \times 2$ matrix, and its inverse and its determinant
are easily programmed for evaluation of the integrands of (5-4), (5-19), and (5-20)
when computing those inversion integrals numerically.
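The $2 \times 2$ case mentioned above can be sketched in a few lines of Python. The function name `mgf_2x2` and the nested-list matrix layout are illustrative assumptions; the code simply evaluates the moment-generating function of a Hermitian quadratic form $U = \tfrac12 \mathbf{V}^+ \mathbf{H} \mathbf{V}$ as written in (D-9), with the inverse and determinant of $\mathbf{I} + z\mathbf{H}\boldsymbol{\phi}$ formed explicitly.

```python
import cmath

def mgf_2x2(z, H, phi, S):
    """h(z) = E[exp(-z U)] for n = 2, following the form of (D-9).

    H, phi: 2x2 nested lists (Hermitian matrix and complex covariance matrix).
    S: list of the two expected values. z may be real or complex.
    """
    # Product H * phi
    hp = [[sum(H[i][k] * phi[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    # A = I + z H phi
    A = [[(1.0 if i == j else 0.0) + z * hp[i][j] for j in range(2)]
         for i in range(2)]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    inv = [[A[1][1] / det, -A[0][1] / det],
           [-A[1][0] / det, A[0][0] / det]]
    # Quadratic term S^+ (I + z H phi)^{-1} H S
    HS = [sum(H[i][k] * S[k] for k in range(2)) for i in range(2)]
    w = [sum(inv[i][k] * HS[k] for k in range(2)) for i in range(2)]
    quad = sum(complex(S[i]).conjugate() * w[i] for i in range(2))
    return cmath.exp(-0.5 * z * quad) / det
```

Raising the determinant factor to the power $M$ and summing the quadratic term over $k$ would give the form (D-10); the result can then serve as the integrand of a numerical Laplace inversion.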
Appendix E

Recursive Methods for Detection Probabilities:
Fading Signals and Random Decision Levels

E.1 Fading Signals

When the receiver sums or "integrates" $M$ quadratically rectified outputs of a filter
matched to the signal to be detected, as in (4-19), and the input signal-to-noise
ratios $d_1^2, \ldots, d_M^2$ are fixed, the probability

$$Q_d = \Pr(U > y \mid H_1) = Q_M(S, y) \tag{E-1}$$

of detection can be calculated by either of the two recursive methods in (C-29)
through (C-32) and (C-33) and (C-34); here

$$S = \tfrac12 \sum_{k=1}^{M} d_k^2 = \frac{E_T}{N}$$

is the total energy-to-noise ratio, $E_T$ being the total received energy and $N$ the
unilateral spectral density of the noise.

If now the input energy-to-noise ratios are not fixed, but random, the average
probability $\bar{Q}_d = \Pr(U > y \mid H_1)$ of choosing hypothesis $H_1$ when signals are actually
present can be calculated by recursive procedures derived from those in Sec. C.3
simply by averaging the terms with respect to the resultant probability distribution
of the total energy-to-noise ratio [Mit71]. Thus we find

$$\bar{Q}_d = \sum_{m=0}^{\infty} \langle b_m \rangle \sum_{k=0}^{m+M-1} g_k, \tag{E-2}$$

where the $g_k$ are given as in (C-31) in terms of the decision level $y$, and where from
(C-30)

$$\langle b_m \rangle = E\left( \frac{S^m e^{-S}}{m!} \right)$$

is the expected value of the term $b_m$ with respect to the distribution of the total
energy-to-noise ratio $S$. The decision level $y$ in (E-1) is still determined by the
preassigned false-alarm probability

$$Q_0 = \Pr(U > y \mid H_0) = e^{-y} \sum_{k=0}^{M-1} \frac{y^k}{k!},$$

as in (4-30), where $U_0 = y$. If $h_S(z) = E(e^{-Sz})$ is the moment-generating function
of $S$, then

$$\langle b_m \rangle = \frac{(-1)^m}{m!} \left. \frac{d^m h_S(z)}{dz^m} \right|_{z=1}.$$

If, for instance, the total energy-to-noise ratio $S$ has a gamma distribution as in
(4-39), $h_S(z)$ is given in (4-40), and we find

$$\langle b_0 \rangle = (1 + S')^{-k},$$

$$\langle b_m \rangle = \frac{\Gamma(k + m)}{\Gamma(k)\, m!}\, \frac{S'^m}{(1 + S')^{k+m}} = \frac{k + m - 1}{m} \left( \frac{S'}{1 + S'} \right) \langle b_{m-1} \rangle, \tag{E-3}$$

$$S' = \frac{E(S)}{k} = \frac{\bar{S}}{k}.$$

One uses (E-2) and (E-3) when $y > E(U \mid H_1) = \bar{S} + M$. Here $k = 1$, $M$, 2, and
$2M$ for Swerling cases 1, 2, 3, and 4, respectively.

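When on the other hand $y < \bar{S} + M$, we modify (C-33) and (C-34) in the same
way. Now

$$1 - \bar{Q}_d = \sum_{m=0}^{\infty} g_{m+M} \langle h_m \rangle$$

with the $g_k$ still given by (C-31), but with the coefficients $\langle h_m \rangle$ formed by averaging
(C-34),

$$\langle h_0 \rangle = \langle b_0 \rangle = \frac{1}{(1 + S')^k}, \qquad \langle h_m \rangle = \langle h_{m-1} \rangle + \langle b_m \rangle, \tag{E-4}$$

and the $\langle b_m \rangle$'s are calculated as in (E-3).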
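The averaging described in this section can be sketched in Python for a gamma-distributed total energy-to-noise ratio. This is a minimal sketch under stated assumptions: the terms are taken to be $g_k = e^{-y} y^k / k!$ and $\langle b_m \rangle$ is generated by the gamma-fading recursion with $S' = E(S)/k$; the function name and the stopping rule (summing until the weights $\langle b_m \rangle$ account for all but `tol` of their unit total) are illustrative choices, and only the direct series is used, without switching to the complementary series for small $y$.

```python
import math

def average_detection_probability(M, k, S_mean, y, tol=1e-12, max_terms=1000000):
    """Average detection probability for gamma-fading signals (illustrative sketch).

    M: number of integrated outputs; k: gamma shape parameter
    (1, M, 2, 2M for Swerling cases 1 to 4); S_mean: E(S); y: decision level.
    """
    S_prime = S_mean / k
    ratio = S_prime / (1.0 + S_prime)
    b = (1.0 + S_prime) ** (-k)        # <b_0>
    g = math.exp(-y)                   # g_0
    G = g                              # running sum g_0 + ... + g_j
    for j in range(1, M):              # build up to G_{M-1}
        g *= y / j
        G += g
    total = b * G
    b_sum = b
    m = 0
    while 1.0 - b_sum > tol and m < max_terms:
        m += 1
        b *= (k + m - 1) / m * ratio   # recursion for <b_m>
        j = m + M - 1
        g *= y / j                     # g_j from g_{j-1}
        G += g                         # G_{m+M-1}
        total += b * G
        b_sum += b
    return total
```

With `S_mean = 0` the loop is skipped and the function returns the false-alarm probability $e^{-y} \sum_{k<M} y^k/k!$; for $M = 1$ and $k = 1$ it reproduces the closed form $\exp[-y/(1 + \bar{S})]$ of a single Rayleigh-fading pulse.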

E.2 Random Decision Level, Fixed Signal Strengths

In the constant-false-alarm-rate (CFAR) receiver studied in Sec. 8.1, the decision
level $y = U_0$ with which the quadratic statistic $U$ of (4-19) is compared is made
proportional to an estimate of the unknown spectral density $N$ of the ambient white
noise; that is, $y = \beta x$, where $x$ is a random variable with a gamma distribution

$$z(x) = \frac{x^{M'-1} e^{-x}}{(M' - 1)!} \tag{E-5}$$

when suitably normalized. Here $M'$ is the number of quadratically rectified outputs
of a narrowband filter that are integrated to form the estimate of the spectral density
$N$; these outputs are assumed to have no components due to the signal to be detected,
and they are supposed to have identical exponential distributions. The constant $\beta$ is
set to provide a preassigned false-alarm probability, which is calculated as in Sec. E.4.

We assume that the $M$ input signal strengths, proportional to $d_1^2, \ldots, d_M^2$,
are fixed. The $N$ figuring in the definition of $d_m^2 = 2E_m/N$ can be taken as a fiducial
level and cancels out of the final result. The average probability $\bar{Q}_d$ of detecting the
signal is obtained by averaging the recurrent relations in Sec. C.3 with respect to the
prior probability density function of $y = \beta x$, $\beta$ a fixed number. Thus when

$$E(y) = \beta E(x) = \beta M' > S + M,$$

we compute the average probability of detection from

$$\bar{Q}_d = \sum_{m=0}^{\infty} b_m \langle G_{m+M-1} \rangle, \tag{E-6}$$

where the $b_m$'s are calculated as in (C-32), and where by (C-31)

$$\langle G_s \rangle = \langle g_s \rangle + \langle G_{s-1} \rangle$$

and

$$\langle g_s \rangle = \int_0^\infty \frac{(\beta x)^s e^{-\beta x}}{s!}\, \frac{x^{M'-1} e^{-x}}{(M' - 1)!}\,dx = \frac{\Gamma(s + M')\, \beta^s}{s!\, \Gamma(M')\, (1 + \beta)^{s + M'}} = \frac{s + M' - 1}{s}\, \frac{\beta}{1 + \beta}\, \langle g_{s-1} \rangle, \tag{E-7}$$

$$\langle G_0 \rangle = \langle g_0 \rangle = (1 + \beta)^{-M'}.$$

For $\beta M' < S + M$, on the other hand, we compute $1 - \bar{Q}_d$ from the average
of (C-32),

$$1 - \bar{Q}_d = \sum_{m=0}^{\infty} \langle g_{m+M} \rangle h_m, \tag{E-8}$$

with $h_m$ calculated as in (C-34) and the $\langle g_s \rangle$ as in (E-7).

E.3 Random Signal Strengths and Random Decision Level

When both the signal strengths and the decision level are random, we average the
terms in the algorithms of Sec. C.3 with respect to both. Assuming the same gamma
distributions as in (4-39) and (E-5), we find as in Secs. E.1 and E.2, from (C-29)
through (C-32),

$$\bar{Q}_d = \Pr(U > \beta x \mid H_1) = \sum_{m=0}^{\infty} \langle b_m \rangle \langle G_{m+M-1} \rangle, \tag{E-9}$$

with the $\langle b_m \rangle$'s calculated as in (E-3) and the $\langle G_s \rangle$'s as in (E-7). One utilizes (E-9)
when $\beta M' > E(S) + M$. When $\beta M' < E(S) + M$, on the other hand, one utilizes
the average of (C-33) and (C-34),

$$1 - \bar{Q}_d = \sum_{m=0}^{\infty} \langle g_{m+M} \rangle \langle h_m \rangle,$$

with the $\langle g_s \rangle$'s calculated as in (E-7) and the $\langle h_m \rangle$'s calculated as in (E-4).

E.4 False-alarm Probability, Random Decision Level

From either (E-6) or (E-9) we find, for $S = 0$ or $E(S) = 0$, the false-alarm probability

$$Q_0 = \Pr(U > \beta x \mid H_0) = \langle G_{M-1} \rangle = \sum_{s=0}^{M-1} \langle g_s \rangle = \frac{1}{(1 + \beta)^{M'}} \sum_{s=0}^{M-1} \frac{\Gamma(s + M')}{s!\, \Gamma(M')} \left( \frac{\beta}{1 + \beta} \right)^s,$$

in which the $\langle g_s \rangle$'s can be computed recursively as in (E-7). For $M \gg 1$ calculating
this sum can be tedious, and a more rapidly converging series results from noting
that

$$Q_0 = 1 - \Pr(U < \beta x \mid H_0) = 1 - \Pr(x > \beta^{-1} U \mid H_0)$$

and utilizing (E-8) with $M$ and $M'$ interchanged, $\beta$ replaced by $\beta^{-1}$, and $h_m \equiv h_0 = 1$,
whereupon

$$Q_0 = \sum_{m=0}^{\infty} \frac{\Gamma(m + M + M')}{(m + M')!\, \Gamma(M)}\, \frac{\beta^{-(m + M')}}{(1 + \beta^{-1})^{m + M + M'}} = \left( \frac{\beta}{1 + \beta} \right)^{M} \sum_{m=0}^{\infty} \frac{\Gamma(m + M + M')}{(m + M')!\, \Gamma(M)}\, \frac{1}{(1 + \beta)^{m + M'}}$$

[Rob76]. This sum can be computed recursively:

$$e_0 = \left( \frac{\beta}{1 + \beta} \right)^{M}, \qquad e_k = \frac{k + M - 1}{k (1 + \beta)}\, e_{k-1}, \qquad Q_0 = \sum_{k=M'}^{\infty} e_k.$$

One generates the $e_k$'s starting from $k = 0$, but does not begin summing them
till $k = M'$; and one stops when the last term added in falls below $\varepsilon$ times the
accumulated sum, with $\varepsilon$ small enough to ensure the number of accurate significant
figures desired.

The probability $Q_0$ thus calculated is the probability that the ratio of two chi-squared-distributed
random variables $\eta_1$ and $\eta_2$ with $f_1 = 2M$ and $f_2 = 2M'$ degrees
of freedom, respectively, exceeds a level $\beta$. Here

$$\eta_1 = \sum_{j=1}^{f_1} x_j^2, \qquad \eta_2 = \sum_{j=1}^{f_2} x_j'^2,$$

where the $x_j$ and $x_j'$ are independent Gaussian random variables with expected values
zero and equal variances. Then

$$Q_0 = \Pr\left( \frac{\eta_1}{\eta_2} > \beta \right) = Q(\beta; f_1, f_2)$$

is called the complementary F-distribution and has been tabulated by Pearson
[Pea68]. The ratio figures in statistical tests of linear hypotheses as in
the analysis of variance, and tables of its percentage points are to be found in many
statistical handbooks under the rubric F distribution. The probability $\bar{Q}_d$ derived in
Sec. E.2 then corresponds to the noncentral F distribution, and our parameter $S$,
or sometimes $2S$, is termed the noncentrality parameter [Tan38], [Pat49], [Rob76].
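The two series for the CFAR false-alarm probability can be compared numerically with a short Python sketch. The function names, the argument `M_ref` (standing for $M'$), and the safeguard `max_terms` are hypothetical; the finite sum uses the gamma-averaged terms $\langle g_s \rangle$ with their one-step recursion, and the infinite series generates terms $e_k$ from $k = 0$ with ratio $(k + M - 1)/[k(1 + \beta)]$, accumulating them from $k = M'$ onward, under the assumption that $U$ and $x$ are gamma variables of orders $M$ and $M'$.

```python
def cfar_false_alarm_direct(M, M_ref, beta):
    """Finite sum of the averaged terms <g_s>, s = 0 .. M-1."""
    g = (1.0 + beta) ** (-M_ref)                # <g_0>
    total = g
    ratio = beta / (1.0 + beta)
    for s in range(1, M):
        g *= (s + M_ref - 1) / s * ratio        # <g_s> from <g_{s-1}>
        total += g
    return total

def cfar_false_alarm_series(M, M_ref, beta, eps=1e-15, max_terms=10000000):
    """Infinite series in e_k, summed from k = M_ref until terms are negligible."""
    e = (beta / (1.0 + beta)) ** M              # e_0
    total = 0.0
    k = 0
    while k < max_terms:
        if k >= M_ref:
            total += e
            if e < eps * total:                 # relative stopping rule
                break
        k += 1
        e *= (k + M - 1) / (k * (1.0 + beta))   # e_k from e_{k-1}
    return total
```

For $M = 1$ both functions give $(1 + \beta)^{-M'}$, the familiar single-pulse result; which form converges faster depends on the sizes of $M$, $M'$, and $\beta$.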
Appendix F

Pulse Reflected from a Moving Target



A radar antenna emits a pulse signal $s(t)$, which travels toward a moving target and
is reflected from it, and a portion of the reflected pulse is picked up by a receiving
antenna in the same vicinity as the transmitter. We want to find the form of the
received pulse when the target is moving away from the antenna with velocity $V$.

Let $F$ be a frame of reference fixed with respect to the antenna, which is at
the origin of coordinates. The $x$-axis points in the direction of the target. In $F$ the
field of the transmitted pulse at point $x$ is proportional to

$$s\left( t - \frac{x}{c} \right), \qquad c = \text{velocity of light}.$$

Let $F'$ be a frame of reference moving with velocity $V$ relative to $F$; in $F'$
the target is stationary. The coordinate in $F'$ parallel to the velocity of $F'$ relative
to $F$ is designated by $x'$ and the time in $F'$ by $t'$. Then the Lorentz-Fitzgerald
transformation equations relate $t$ and $x$ to $t'$ and $x'$ through

$$t = \Gamma\left( t' + \frac{V x'}{c^2} \right), \qquad x = \Gamma(x' + V t'), \tag{F-1}$$

with

$$\Gamma = \left[ 1 - \frac{V^2}{c^2} \right]^{-1/2} \tag{F-2}$$

[Tol34, pp. 18-21]. At point $x'$ in $F'$ at time $t'$ the transmitted signal field is

$$s\left( \Gamma\left( 1 - \frac{V}{c} \right)\left( t' - \frac{x'}{c} \right) \right), \tag{F-3}$$

as we find by substituting from (F-1).

Let $d_0'$ be the distance from transmitter to target in $F'$. Then the field at
the target when the pulse strikes it is given by (F-3) with $x' = d_0'$. The reflected
field is propagating in the opposite direction, but within an inconsequential sign
and reflection coefficient it must have the same time dependence as that in (F-3) for
$x' = d_0'$. Hence the reflected pulse can be represented by

$$s_r(x', t') = s\left( \Gamma\left( 1 - \frac{V}{c} \right)\left( t' + \frac{x'}{c} - \frac{2 d_0'}{c} \right) \right).$$

We translate this into the coordinates of frame $F$ by using (F-1), from which we
obtain

$$t' + \frac{x'}{c} = \Gamma\left( 1 - \frac{V}{c} \right)\left( t + \frac{x}{c} \right),$$

whereupon

$$s_r(x, t) = s\left( \Gamma^2 \left( 1 - \frac{V}{c} \right)^2 \left( t + \frac{x}{c} \right) - \Gamma\left( 1 - \frac{V}{c} \right) \frac{2 d_0'}{c} \right). \tag{F-4}$$



Now the coordinate of the target at time $t = 0$ when the pulse was transmitted
was $x = d_0$ in frame $F$ and $x' = d_0'$ in frame $F'$. By the inverse transformation to
that in (F-1),

$$x' = \Gamma(x - Vt),$$

these distances are related by

$$d_0' = \Gamma d_0, \qquad d_0 = d_0' \left[ 1 - \frac{V^2}{c^2} \right]^{1/2}, \tag{F-5}$$

which expresses the Lorentz contraction; to someone stationary in the frame $F$ of the
antenna, the distance $d_0$ to the target seems shorter than to someone moving along
with the target. Putting (F-5) into (F-4) and setting $x = 0$, we find the received field
at the antenna to be

$$s_r(0, t) = s\left( \Gamma^2 \left( 1 - \frac{V}{c} \right)^2 t - \Gamma^2 \left( 1 - \frac{V}{c} \right) \frac{2 d_0}{c} \right) = s\left( \frac{c - V}{c + V} \left[ t - \frac{2 d_0}{c - V} \right] \right)$$

when we use (F-2). The term

$$\tau = \frac{2 d_0}{c - V}$$

represents the interval between transmission and reception of the pulse. The pulse
is stretched in time by a factor

$$r^{-1} = \frac{c + V}{c - V},$$

which is greater than 1 when the target is moving away from the antenna.
For a narrowband transmitted pulse

$$s(t) = \operatorname{Re} F(t) e^{i\Omega t},$$

the received echo, within a constant of proportionality, is

$$s_r(0, t) = \operatorname{Re}\left[ F(r(t - \tau))\, e^{i\Omega'(t - \tau)} \right].$$

To terms of first order in $V/c$, the angular carrier frequency of the received signal
is

$$\Omega' = \Omega\left( 1 - \frac{2V}{c} \right) = \Omega + w,$$

where $w = -(2V/c)\Omega$ is the Doppler shift. The stretching of the complex envelope,
indicated by the factor $r$, is usually insignificant.
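To get a feeling for the magnitudes involved, the following Python sketch evaluates the delay, the stretch factor, and the Doppler shift for a receding target. The function name `echo_parameters`, the dictionary keys, and the numerical value used for the velocity of light are assumptions made for illustration; the exact shift is taken as $\Omega(c - V)/(c + V) - \Omega$ and is compared with the first-order value $-(2V/c)\Omega$.

```python
import math

C_LIGHT = 299792458.0  # velocity of light in m/s (assumed value for this sketch)

def echo_parameters(d0, V, omega, c=C_LIGHT):
    """Delay, time-stretch factor, and Doppler shift for a target receding at speed V.

    d0: target distance at the instant of transmission (antenna frame), in metres.
    omega: angular carrier frequency of the transmitted pulse, in rad/s.
    """
    return {
        "delay": 2.0 * d0 / (c - V),                      # interval to reception
        "stretch": (c + V) / (c - V),                     # > 1 for a receding target
        "doppler_exact": -2.0 * V * omega / (c + V),      # omega (c-V)/(c+V) - omega
        "doppler_first_order": -2.0 * V * omega / c,      # first order in V/c
    }
```

For a target at 10 km receding at 300 m/s and a 10-GHz carrier, the two Doppler values differ by about one part in $10^6$, which illustrates why the first-order expression is normally adequate.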
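When a train of pulses separated by $T$,

$$s(t) = \operatorname{Re} \sum_{k=1}^{M} F(t - (k-1)T)\, e^{i\Omega[t - (k-1)T]},$$

is transmitted, the echo signal is

$$s_r(0, t) = \operatorname{Re} \sum_{k=1}^{M} F[r(t - \tau) - (k-1)T]\, e^{i\Omega[r(t - \tau) - (k-1)T]},$$

and to first order in $V/c$ this is

$$s_r(0, t) \approx \operatorname{Re} \sum_{k=1}^{M} F\left[ r\left( t - \tau - \left( 1 + \frac{2V}{c} \right)(k-1)T \right) \right] \exp\left\{ i\Omega' \left[ t - \tau - \left( 1 + \frac{2V}{c} \right)(k-1)T \right] \right\}.$$

The factor $r$ in the argument of the complex envelope can usually be neglected. The
factor

$$\left( 1 + \frac{2V}{c} \right) T$$

in the complex envelope accounts for the displacement of the target by $VT$ during
each interpulse interval.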
Appendix G

Asymptotic Relative Efficiency of the Rank Test



In order to calculate the asymptotic relative efficiency of the rank test, we must
work out the effective signal-to-noise ratio of the rank statistic $r$ as given in (8-35)
and (8-36). We use the latter to determine the expected value of $r$ under hypothesis
$H_1$. In that sum there are $M$ terms with $i = j$ and $\tfrac12 M(M - 1)$ with $i \ne j$, and its
expected value is

$$E(r \mid H_1) = M E[U(g_i) \mid H_1] + \tfrac12 M(M - 1) E[U(g_i + g_j) \mid H_1, i \ne j].$$



The first term is simply 




by (8-24), where 




is the cumulative distribution function of the datum g. Furthermore, 



$$E[U(g_i + g_j) \mid H_1, i \ne j]$$

in terms of the probability density function and the cumulative distribution function
of the noise. The numerator of the expression for the effective signal-to-noise ratio
of the statistic $r$ is thus

$$E(r \mid H_1) - E(r \mid H_0) = \tfrac12 M(M - 1) \int_{-\infty}^{\infty} p_0(y) [P_0(y + 2A) - P_0(y)]\,dy + M [P_0(A) - P_0(0)],$$

and because for $A \ll \sigma$ we can write

$$P_0(y + 2A) - P_0(y) \approx 2A p_0(y),$$

this becomes

$$E(r \mid H_1) - E(r \mid H_0) \approx M(M - 1) A \int_{-\infty}^{\infty} [p_0(y)]^2\,dy + O(M). \tag{G-1}$$

Here $O(M)$ is the second term, which is of order $M$ and becomes relatively negligible
when we pass to the limit $M \gg 1$.

We also need the variance of the rank statistic $r$ when no signal is present. This
we obtain from (8-35). Introducing the random variables $w_k = U(g_k)$, for which

$$\Pr(w_k = 1 \mid H_0) = \Pr(w_k = 0 \mid H_0) = \tfrac12,$$

and which are independent under hypothesis $H_0$, we can write

$$r = \sum_{k=1}^{M} k\, w_k,$$

and its variance is

$$\operatorname{Var} r = \sum_{k=1}^{M} k^2 \operatorname{Var} w_k = \tfrac14 \sum_{k=1}^{M} k^2 = \tfrac{1}{24} M(M + 1)(2M + 1).$$

Then for $M \gg 1$ the effective signal-to-noise ratio of $r$ is, by (G-1) and (4-64),

$$D_r^2 = 12 M A^2 \left[ \int_{-\infty}^{\infty} [p_0(y)]^2\,dy \right]^2.$$

For the statistic $G$ in (8-19) the effective signal-to-noise ratio is

$$D_G^2 = \frac{M A^2}{\sigma^2},$$

and dividing we find the asymptotic relative efficiency as stated in (8-41).
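The variance formula for the rank statistic under $H_0$ can be checked by brute force for small $M$. The sketch below, with hypothetical function names, enumerates all $2^M$ equally likely outcomes of the binary variables $w_k$ and compares the exact variance of $r = \sum_k k\,w_k$ with $\tfrac{1}{24} M(M + 1)(2M + 1)$, using rational arithmetic so that the comparison is exact.

```python
from fractions import Fraction
from itertools import product

def rank_variance_enumerated(M):
    """Exact variance of r = sum_k k*w_k with independent w_k equal to 0 or 1, each with probability 1/2."""
    values = [sum(k * w for k, w in zip(range(1, M + 1), ws))
              for ws in product((0, 1), repeat=M)]
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((Fraction(v) - mean) ** 2 for v in values) / n

def rank_variance_formula(M):
    """Closed form M(M+1)(2M+1)/24."""
    return Fraction(M * (M + 1) * (2 * M + 1), 24)
```

The enumeration grows as $2^M$, so it is only a check for small $M$; the closed form is what enters the effective signal-to-noise ratio.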
Appendix H

Probability of Detection:
Lorentz Spectral Density

H.1 The Residue Series



When the stochastic signals to be detected in white noise have the Lorentz spectral
density (11-47), the Fredholm determinant can be written

$$D(z) = e^{-m} \left[ \cosh mg + \frac{1 + g^2}{2g} \sinh mg \right], \qquad m = \mu T, \quad g = \sqrt{1 + Kz}, \quad K = \frac{2E}{Nm}, \tag{H-1}$$

from (11-128). Here $E = P_s T$ is the average energy of the stochastic signal for which
the detector is designed to be optimum [Sie57]. When $z$ is real and less than $-K^{-1}$,
the parameter $g$ is imaginary, and putting $g = ig'$, we find

$$D(z) = e^{-m} \left[ \cos mg' + \frac{1 - g'^2}{2g'} \sin mg' \right], \qquad g' = \sqrt{-Kz - 1}. \tag{H-2}$$



By equating the bracket to zero, one finds an equation from which the eigenvalues
$\lambda_k$ can be calculated. (In this appendix eigenvalues are indexed from $k = 0$.)
With $z = -1/\lambda_k$,

$$2 g_k' \cos mg_k' + (1 - g_k'^2) \sin mg_k' = 0,$$

where

$$g_k' = (K \lambda_k^{-1} - 1)^{1/2}.$$

When as in Example 2-4 of Sec. 2.3.2 we substitute $g_k' = \cot\theta_k$, we obtain

$$m \cot\theta_k = 2\theta_k + k\pi, \qquad k = 0, 1, 2, \ldots, \qquad 0 < \theta_k < \tfrac12 \pi, \tag{H-3}$$

and the eigenvalues are

$$\lambda_k = \frac{K}{1 + g_k'^2} = K \sin^2 \theta_k. \tag{H-4}$$
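The eigenvalue computation just described lends itself to a short numerical sketch. The Python code below solves $m \cot\theta = 2\theta + k\pi$ on $(0, \tfrac12\pi)$ by bisection (the left-hand side minus the right-hand side decreases monotonically there), forms $\lambda_k = K \sin^2\theta_k$, and evaluates a Fredholm determinant of the form $e^{-m}[\cosh mg + \tfrac{1 + g^2}{2g} \sinh mg]$ with $g = \sqrt{1 + Kz}$ so that the roots can be verified. The function names are hypothetical, bisection is a choice made here for robustness, and the determinant routine is not defined at $z = -1/K$, where $g = 0$.

```python
import cmath
import math

def lorentz_theta(m, k, iterations=200):
    """Root of m*cot(theta) = 2*theta + k*pi in (0, pi/2), found by bisection."""
    lo, hi = 0.0, 0.5 * math.pi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f = m / math.tan(mid) - 2.0 * mid - k * math.pi
        if f > 0.0:
            lo = mid          # root lies at larger theta
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lorentz_eigenvalue(m, K, k):
    """Eigenvalue lambda_k = K sin^2(theta_k)."""
    return K * math.sin(lorentz_theta(m, k)) ** 2

def fredholm_determinant(z, m, K):
    """D(z) for real z; complex arithmetic covers the range where g is imaginary."""
    g = cmath.sqrt(1.0 + K * z)
    value = math.exp(-m) * (cmath.cosh(m * g)
                            + (1.0 + g * g) / (2.0 * g) * cmath.sinh(m * g))
    return value.real
```

Evaluating `fredholm_determinant(-1.0 / lorentz_eigenvalue(m, K, k), m, K)` should give a value close to zero for each $k$, which is a convenient check on both routines.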



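The derivation of an expression for the parameter $R_k$ in (11-79) is quite tedious.
For the time being we omit subscripts. In (H-2) we replace $mg'$ by $c$, writing it as

$$D(z) = e^{-m} \left[ \cos c + \frac12 \left( \frac{m}{c} - \frac{c}{m} \right) \sin c \right],$$

and we observe that at $z = -1/\lambda$

$$\cot c = \frac{c^2 - m^2}{2mc}. \tag{H-5}$$

Furthermore, because

$$c = mg' = m(-Kz - 1)^{1/2}$$

and

$$\frac{dc}{dz} = -\frac{m^2 K}{2c},$$

we find that

$$\frac{dD}{dz} = -\frac{m^2 K}{2c} \frac{dD}{dc} = -\frac{m^2 K}{2c}\, e^{-m} \left[ -\left( 1 + \frac{m^2 + c^2}{2mc^2} \right) \sin c + \frac{m^2 - c^2}{2mc} \cos c \right].$$

Using (H-5) we write this as

$$D'(z) = \frac{m^2 K e^{-m}}{2c} \left[ \left( 1 + \frac{m^2 + c^2}{2mc^2} \right) \sin c + \frac{\cos^2 c}{\sin c} \right]$$

$$= \frac{m^2 K e^{-m}}{2c \sin c} \left[ 1 + \left( \frac{m^2 + c^2}{2mc^2} \right) \sin^2 c \right]$$

$$= \frac{m^2 K e^{-m}}{2c \csc c} \left[ \csc^2 c + \frac{m^2 + c^2}{2mc^2} \right]. \tag{H-6}$$

Now with $g' = c/m$ we find from (H-4) that

$$\lambda = \frac{m^2 K}{m^2 + c^2}, \tag{H-7}$$

or

$$c^2 = m^2\, \frac{K - \lambda}{\lambda}, \tag{H-8}$$

and

$$\csc^2 c = \cot^2 c + 1 = \left( \frac{c^2 - m^2}{2mc} \right)^2 + 1 = \left( \frac{c^2 + m^2}{2mc} \right)^2$$

from (H-5), so that by (H-8)

$$c \csc c = \pm \frac{m^2 + c^2}{2m} = \pm \frac{mK}{2\lambda}. \tag{H-9}$$

Putting this into (H-6), we obtain

$$D'(-1/\lambda) = \pm\, e^{-m}\, \frac{K(mK + 2\lambda)}{4(K - \lambda)},$$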

and from (11-79), at last, with k = 0, 1, 2, ... , 

^ " ( U K{mK + 2k k ) e ■ 

The factor $(-1)^k$ arises because by (H-3) $c = mg_k'$ lies between $k\pi$ and $(k + 1)\pi$ and
$\csc c$ in (H-9) is therefore positive for $k$ even and negative for $k$ odd. This formula
was derived by S. O. Rice. If we use (H-4), we can write it

* - en* em :f. f ; ch-ho 

m + 2 sm 9ft 

as in [Hel83]. 

From (H-7) 

m 2 K I . . . fcir ^ _ (A: + 1)tt 

m 2 + 4 N T J 1 

Because the Lorentz spectral density drops off to zero with increasing frequency $\omega$
very slowly, the eigenvalues decrease slowly to zero, and it is necessary to include
a great many terms in (11-80) in order to compute the detection probability $Q_d(\eta)$
accurately. The greater the time-bandwidth product $m = \mu T$ and the greater the
input energy-to-noise ratio $\eta E/N$, the more terms must be taken into the summation.
The coefficients $R_k$, given by (H-10), alternate in sign, and when $m \gg 1$ their
absolute values are large and lie close together for small indices $k$. The corresponding
terms of (11-80) are then large and not much different in magnitude and must
each be computed to high precision if their sum is to be reliable. For these reasons
we found it necessary to turn to the method of saddlepoint integration for values of
$m = \mu T$ greater than about 7.

H.2 The Saddlepoint Method 

As described in Sec. 5.2, it is first necessary to find the saddlepoint of the integrand
of (11-94) or (11-95). Because the Fredholm determinant as given by (H-1) is rather
complicated, the most expeditious method of computing the saddlepoint seems to
be by numerically minimizing the phase

$$\Phi(z) = \ln h_\eta(z) + U_0 z - \ln |z| \tag{H-11}$$

for real values of $z$. For (11-94) the saddlepoint lies to the right of the origin; for
(11-95) it lies between the origin and the rightmost pole of $h_\eta(z)$, which by (11-75)
is located at

$$z = -\frac{1 + \lambda_0}{\lambda_0 (1 + \eta \lambda_0)},$$

where $\lambda_0$ is the largest eigenvalue of (11-74). This eigenvalue can be rather quickly
calculated by first finding the root $\theta_0$ of (H-3) lying between 0 and $\tfrac12 \pi$ and then
determining $\lambda_0$ from (H-4). It is wise to do so in order to determine the region
in which to search for the left-hand saddlepoint of the integrand of (11-95). This
saddlepoint need not be located with high precision, and even a crude minimization
of $\Phi(z)$ in (H-11) will usually suffice.

In accordance with the Neyman-Pearson criterion, the decision level $U_0$ is
chosen to yield a preassigned false-alarm probability

$$Q_0 = -\int_{c - i\infty}^{c + i\infty} z^{-1} h_0(z) \exp(U_0 z)\, \frac{dz}{2\pi i}, \qquad h_0(z) = \frac{D(1)}{D(1 + z)}. \tag{H-12}$$

The quickest way to compute the decision level $U_0$ seems to be first to approximate
(H-12) by the saddlepoint approximation (5-56), neglecting the correction term:

$$Q_0 \approx [2\pi \Phi''(z_0)]^{-1/2} \exp \Phi(z_0),$$

$$\Phi(z) = \ln D(1) - \ln D(1 + z) + U_0 z - \ln(-z). \tag{H-13}$$

At the saddlepoint $z = z_0$,

$$\Phi'(z_0) = -\frac{D'(1 + z_0)}{D(1 + z_0)} + U_0 - \frac{1}{z_0} = 0. \tag{H-14}$$

Solving this for $U_0$ and substituting it into (H-13), we obtain

$$\ln Q_0 \approx \ln D(1) - \ln D(1 + z_0) + \frac{z_0 D'(1 + z_0)}{D(1 + z_0)} + 1 - \ln(-z_0) - \tfrac12 \ln[2\pi \Phi''(z_0)] \tag{H-15}$$

with

$$\Phi''(z_0) = \left[ \frac{D'(1 + z_0)}{D(1 + z_0)} \right]^2 - \frac{D''(1 + z_0)}{D(1 + z_0)} + \frac{1}{z_0^2}.$$

The derivatives are evaluated numerically. One uses the secant method to find the
value of $z = z_0$ satisfying (H-15), with $Q_0$ the preassigned false-alarm probability,
and then one substitutes into (H-14) and solves for the value of the decision level $U_0$.
One then turns to numerical integration of (11-95) with $\eta = 0$, again applying the
secant method, but varying the level $U_0$ until the preassigned false-alarm probability
$Q_0$ is reached. At this stage it is unnecessary to alter the intersection $z_0$ of the
path of integration with the real $z$-axis. We used a parabolic path of integration,
determining its curvature as described in Sec. 5.2.2.

H.3 The Toeplitz Approximation 

For time-bandwidth products $m = \mu T$ greater than about 12, it sufficed for the sake
of plotting the curves in Fig. 11-2 to use the Toeplitz approximation described in
Secs. 11.2.5 and 11.2.6. Substituting (11-47) into (11-105), we find

1 + zAr'*,«o) = K~ 
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with t/ , 

and by (11-107) the Fredholm determinant is approximately

$$D(z) \approx \exp m\left[ \sqrt{1 + Kz} - 1 \right], \tag{H-16}$$

which corresponds to the limiting form of (H-1) when $m \gg 1$ and we approximate
$g$ by 1 everywhere except in $\exp mg$. Now the moment-generating function of the
optimum detection statistic under hypothesis $H_1$ is, by (11-73) or (11-96), with $\eta = 1$,

$$h_1(z) = \mathcal{L}[p_1(U)] = \frac{1}{D(z)} \approx \exp m\left[ 1 - \sqrt{1 + Kz} \right].$$

Here $\mathcal{L}$ stands for the Laplace transform.

The random variable $V = U/K$ then has the probability density function

$$\tilde{p}_1(V) \approx \mathcal{L}^{-1}\left[ \exp m\left( 1 - \sqrt{1 + z} \right) \right].$$

Recognizing that for any function $F(z)$,

$$\mathcal{L}^{-1} F(1 + z) = e^{-V} \mathcal{L}^{-1} F(z),$$

and using [Erd54, eq. 5.6(1), p. 245], we find for the probability density function of
$V$ under hypothesis $H_1$

$$\tilde{p}_1(V) \approx \frac{m}{2\sqrt{\pi}}\, V^{-3/2} \exp\left[ -\frac{(V - \tfrac12 m)^2}{V} \right],$$

and the random variable

$$x = \frac{U}{S} = \frac{2V}{m}, \qquad S = \tfrac12 Km = \frac{E}{N},$$

has the probability density function

$$p_1(x) \approx \left( \frac{m}{2\pi} \right)^{1/2} x^{-3/2} \exp\left[ -\frac{m}{2x} (x - 1)^2 \right] \tag{H-17}$$

in this Toeplitz approximation.

From (11-31) through (11-34) we find the likelihood ratio

$$\Lambda(x) = \frac{p_1(x)}{p_0(x)} = \frac{e^{Sx}}{D(1)},$$

so that

$$p_0(x) = D(1)\, e^{-Sx} p_1(x).$$

Now $D(1) = \exp m[(1 + K)^{1/2} - 1]$ in this approximation, by (H-16), and substituting
(H-17) we find after a little algebra that under hypothesis $H_0$ the probability
density function of $x = U/S$ is

$$p_0(x) \approx \left( \frac{m}{2\pi} \right)^{1/2} x^{-3/2} \exp\left[ -\frac{m}{2x} \left( x \sqrt{1 + K} - 1 \right)^2 \right]. \tag{H-18}$$

Both (H-17) and (H-18) can be integrated in closed form to obtain the false-alarm
and detection probabilities

$$Q_0 \approx \operatorname{erfc}\left\{ \left( \frac{M}{x_0} \right)^{1/2} (x_0 - 1) \right\} - e^{2M} \operatorname{erfc}\left\{ \left( \frac{M}{x_0} \right)^{1/2} (x_0 + 1) \right\}, \tag{H-19}$$

$$M = m \sqrt{1 + K}, \qquad x_0 = x \sqrt{1 + K},$$

$$Q_{ds} \approx \operatorname{erfc}\left\{ \left( \frac{m}{x} \right)^{1/2} (x - 1) \right\} - e^{2m} \operatorname{erfc}\left\{ \left( \frac{m}{x} \right)^{1/2} (x + 1) \right\}, \tag{H-20}$$

in which $x = U_0 / S$.

Given the false-alarm probability $Q_0$, one calculates $x_0$ from (H-19) by Newton's
method and then substitutes $x = x_0 (1 + K)^{-1/2}$ into (H-20) to determine the
probability $Q_{ds} = Q_d(1)$ of detecting the standard signal.
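The procedure of this section can be sketched in Python under several assumptions that should be checked against the definitions used elsewhere in the book. First, the densities of $x = U/S$ are taken to be of the form $(a/2\pi)^{1/2} x^{-3/2} \exp[-a(x - 1)^2/(2x)]$, whose tail probability has a known closed form. Second, erfc is assumed to denote the upper tail of the unit normal distribution, $\operatorname{erfc} x = (2\pi)^{-1/2} \int_x^\infty e^{-t^2/2}\,dt$, which is expressed below through `math.erfc`. Third, bisection is used in place of Newton's method for simplicity. All function names are hypothetical, and the factor $e^{2a}$ limits the sketch to moderate values of $m$ (it overflows for $a$ above about 350).

```python
import math

def normal_tail(x):
    """Upper tail of the unit normal distribution (assumed meaning of erfc here)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def tail_probability(x, shape):
    """Pr(X > x) for the density sqrt(shape/(2 pi)) x^(-3/2) exp[-shape (x-1)^2 / (2x)]."""
    s = math.sqrt(shape / x)
    return normal_tail(s * (x - 1.0)) - math.exp(2.0 * shape) * normal_tail(s * (x + 1.0))

def false_alarm_probability(x, m, K):
    """Toeplitz-approximation false-alarm probability for the level x = U0/S."""
    root = math.sqrt(1.0 + K)
    return tail_probability(x * root, m * root)

def detection_probability(x, m):
    """Toeplitz-approximation probability of detecting the standard signal."""
    return tail_probability(x, m)

def threshold_for_false_alarm(q0, m, K, lo=1e-6, hi=100.0, iterations=200):
    """Level x = U0/S giving false-alarm probability q0, found by bisection."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if false_alarm_probability(mid, m, K) > q0:
            lo = mid          # probability still too large: raise the level
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A typical use is `x = threshold_for_false_alarm(1e-3, 12.0, 2.0)` followed by `detection_probability(x, 12.0)`; the values $m = 12$ and $K = 2$ are arbitrary illustrations.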
Appendix I

Derivation of the Kalman-Bucy Equations



In order to derive (11-175) we write (11-174) as

$$z \phi_x(T, t) C^+ = \int_0^T k_z(T, u) \phi_z(u, t)\,du, \tag{I-1}$$

where by (11-166)

$$\phi_z(t, u) = N \delta(t - u) + z \phi_s(t, u) = N \delta(t - u) + z C \phi_x(t, u) C^+. \tag{I-2}$$

Differentiating (11-165) and using (11-160), we obtain

$$\frac{\partial \phi_x(t, u)}{\partial t} = F(t) \phi_x(t, u), \qquad u < t. \tag{I-3}$$

Differentiating (I-1) with respect to $T$ and using (I-3), we find

$$z \frac{\partial \phi_x(T, t)}{\partial T} C^+ = z F(T) \phi_x(T, t) C^+$$

$$= \int_0^T \frac{\partial k_z(T, u)}{\partial T} \phi_z(u, t)\,du + k_z(T, T) \phi_z(T, t)$$

$$= \int_0^T \frac{\partial k_z(T, u)}{\partial T} \phi_z(u, t)\,du + z k_z(T, T) C \phi_x(T, t) C^+, \qquad T > t,$$

by (I-2). Hence, using (I-1) again,

$$\int_0^T \frac{\partial k_z(T, u)}{\partial T} \phi_z(u, t)\,du = z [F(T) - k_z(T, T) C] \phi_x(T, t) C^+ = \int_0^T [F(T) - k_z(T, T) C] k_z(T, u) \phi_z(u, t)\,du,$$

or

$$\int_0^T \left[ \frac{\partial k_z(T, u)}{\partial T} - F(T) k_z(T, u) + k_z(T, T) C k_z(T, u) \right] \phi_z(u, t)\,du = 0. \tag{I-4}$$

This equation implies that the contents of the brackets must vanish for all $u$, that is,

$$\frac{\partial k_z(T, u)}{\partial T} = F(T) k_z(T, u) - k_z(T, T) C k_z(T, u), \qquad T > u, \tag{I-5}$$

which corresponds to (11-175).

In order to justify this step, replace any component of the vector in the brackets
in (I-4) by $f^*(u)$ and use (I-2), whereupon (I-4) becomes

$$\int_0^T f^*(u) [N \delta(u - t) + z \phi_s(u, t)]\,du = 0,$$

$$z \int_0^T f^*(u) \phi_s(u, t)\,du = -N f^*(t).$$

Multiply by $f(t)\,dt$ and integrate. Then

$$z \int_0^T\!\!\int_0^T f^*(u) \phi_s(u, t) f(t)\,du\,dt = -N \int_0^T |f(t)|^2\,dt.$$

Because $\phi_s(u, t)$ is a positive-definite kernel, this equation can hold only if $z$ is real
and negative or if $f^*(u) \equiv 0$. We exclude the former possibility. Equation (I-5)
will then be valid everywhere in the complex $z$-plane except possibly for points on
the negative Re $z$-axis. By analytical continuation, however, our final result will
hold there as well, except at the points $z = -1/\lambda_k$, where the functions involved are
singular, the $\lambda_k$ being the eigenvalues of (11-74).

Now we seek the differential equation satisfied by $P_z(t)$ as defined by (11-178).
Recall that by (11-161) and (11-165) $\phi_x(t, t) = L(t)$, which obeys the differential
equation (11-163). Differentiating (11-178) with respect to time $t$ and using (I-3),
(I-5), and (11-163), we obtain

$$\frac{dP_z}{dt} = z \frac{dL}{dt} - z k_z(t, t) C L(t) - z \int_0^t \frac{\partial}{\partial t} \left[ k_z(t, u) C \phi_x(u, t) \right] du$$

$$= z (F L + L F^+ + Q G G^+) - z k_z(t, t) C L(t) - z \int_0^t [F k_z(t, u) - k_z(t, t) C k_z(t, u)] C \phi_x(u, t)\,du - z \int_0^t k_z(t, u) C \phi_x(u, t) F^+(t)\,du.$$

Substituting from (11-178) and (11-177), we obtain

$$\frac{dP_z}{dt} = F P_z + P_z F^+ + z Q G G^+ - z k_z(t, t) C \left[ L(t) - \int_0^t k_z(t, u) C \phi_x(u, t)\,du \right]$$

$$= F P_z + P_z F^+ + z Q G G^+ - N^{-1} P_z C^+ C P_z,$$

which is the Riccati equation (11-179).
Appendix J

Moment-generating Function: Coherent Plus Stochastic Signal

When the stochastic signal in Sec. 11.5 is stationary, the equation (11-199) becomes

$$R(t) = Q(t; z) + \frac{z}{N} \int_0^T \phi_s(t - u) Q(u; z)\,du, \qquad 0 < t < T, \tag{J-1}$$

and it can be solved by the techniques of Appendix A. We identify $s(t)$ of (A-1) with
$R(t)$ and $q(u)$ with $Q(u; z)$, and we take as the kernel of that integral equation

$$\phi(t - u) = \delta(t - u) + \frac{z}{N} \phi_s(t - u).$$

The Fourier transform of this kernel is

$$\Phi(\omega) = 1 + \frac{z}{N} \Phi_s(\omega),$$

and it can be expressed as in (A-2) with $m = n$. The $\beta_j$'s are the roots of

1 + ^4>*Hp) = 0. = Pi, Pa., 

as in (11-106). The $2n$ linear simultaneous equations (A-14) now become

! i 

where by (A-15) the $2n \times 2n$ matrix $\mathbf{G} = \mathbf{G}(T)$ is the same as in (11-143). The $n$
elements of the column vectors $\mathbf{R}$ and $\mathbf{L}$ are given as in (A-10) by

U = e^Q«,(u;z)du, R k = e^^Q^u; z) du (J-2) 

J— 00 ^5" 
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(J-4) 



in terms of the solution of (A-8), which we write as 

2 f°° 

*(0 - ; + ^ J^^c/ - «)&»(«; 2) rf Mj o < / < r. (j-3) 

When as in (1 1-200) the signal envelope R(t) is a sum of sine waves, the solution 
Q<o(t; z) is 

Q*(t;-z) = £ rf " ■ exp /w m /, 
m #(w w ;2) 

ff(w;2) = 1 + ~<S> s {w\ 
as can be seen by substitution. Then we find from (J-2) 

m « 2} 
- V r m . ^ 

~ ^ utu, • »v : — ^ exp ,w « r ' 

m W(w m ;2)(iXfc - w M ) 

m «(W m ; Z) J-co 



(J-5) 



(J-6) 



when we put (x* + „ = 1 < & < «, as before. 
With 

Q(tl z) = Q M (t; 2) + £ cj exp fyt 

as in (A- 5) and (A-6), we can write J{z) in (1 1-198) as 

J(z) = Ji+ j 2i 

with 

1 C T 

Ji= 2N) **(0g»(/;*)*, (j. 7 ) 

J2 ~ 2N^ CJ \ R * (t) ex P & du (J-8) 
From (11-200) and (J-4) we find 

h m 2) Jo 

which yields (1 1-201) immediately. 
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Now with (11-200), (J-8) becomes 



/ 2 = ilX r*cj fexp(0, - m m )t dt 
__Lvv * r exp(fr-/vv,,,)r-l ] 

1 2 " 



^^1 

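with the $E_j$ as in (11-203). Now defining

$$F_k = R_k, \quad 1 \le k \le n, \qquad F_k = -L_{k-n}, \quad n + 1 \le k \le 2n,$$

as in (11-204), we must by (A-14) solve the $2n$ equations

$$\mathbf{G} \mathbf{c} = \mathbf{F},$$

or $\mathbf{c} = \mathbf{G}^{-1} \mathbf{F}$, and we can write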

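taking $\mathbf{E}$ as a row vector of the $E_j$'s and $\mathbf{F}$ as a column vector of the $F_k$'s. By the
rule for the determinant of a block matrix,

$$\det \begin{bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{bmatrix} = \det \mathbf{D}\, \det(\mathbf{A} - \mathbf{B} \mathbf{D}^{-1} \mathbf{C}),$$

with $\mathbf{A} = 0$, $\mathbf{B} = \mathbf{E}$, $\mathbf{D} = \mathbf{G}$, $\mathbf{C} = \mathbf{F}$, and using the fact that $\mathbf{E} \mathbf{G}^{-1} \mathbf{F}$ is a matrix with
a single element and hence equal to its determinant, we obtain (11-202).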
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general properties, 379 
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Hermite signals, 384 
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pulse train, 386 

reduced, 251 

restrictions, 380 

self-transform property, 380 

single pulse, 381 
Ambiguity matrix, 363 
Ambiguity surface, 380 
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Amplitude: 



standard, 126 

unknown, detection, 125-7 
Amplitude modulation, 89 
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Antenna, effective area, 73 
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estimation, 233 

mean-square error, 235 
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stationary noise, 30 
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Autocovariance matrix, complex, 100 
Autoregressive moving-average 

process, 459, 462 
Avalanche photodiode, 491 
Average power, stochastic process, 466 
Average sample number, sequential 
test, 344 



B 

Balanced channel: 

binary, 116 

M-ary, 133 

Rayleigh fading, 134 
Bandwidth: 

effective, 270 

root-mean-square (rms), 235, 280 
Barker sequence, 390 
Bayes cost, 122 

in estimation, 228 
Bayes criterion, 46 

in estimation, 227 
Bayes strategy, 7 

binary decisions, 8-9 

sequential detection, 337 

signal resolution, 358 
Bayes's rule, 3-5, 7 
Beam pattern, 160 
Beamforming, 137, 156-64 

point source of noise, 161-4 
Bessel function, modified, 108 

asymptotic form, 112 

trigonometric integral, 108, 527 
Beta distribution, 307 
Bias, 222 

Bilateral exponential distribution, 

321, 324 
Binary channel, 114 
incoherent: 
balanced, 116 
unilateral, 117 



Binary communication system, 

multiple signals, 137 
Binary decisions, 8 
Binary integration, 176 

detection probability, 179 
Blind velocities, 378 
Boltzmann distribution, 70 
Boltzmann's constant, 69, 481 
Bose-Einstein distribution, 478 
Campbell's theorem, 36, 499 
Carrier, 89 

Carrier frequency, unknown, 282-6, 
295 

least favorable distribution, 300 
Cauchy distribution, 24 
Cauchy-Schwarz inequality, 50 

traces, 412 
Causal quasi-estimate, 455, 480 
Censored mean-level detector, 310 
Characteristic function, 189 

circular Gaussian, 103 
Chebyshev inequality, 245 
Chernoff bound, 211-3 
Chirp modulation, 382 
Chi-squared distribution, 18, 428 
Circular Gaussian density function, 
102 

Circular Gaussian processes, 100-3 

characteristic function, 103 

joint moments, 103 
Clutter, 356, 395 

detection of pulse trains in, 377 

detection of single pulses in, 370-7 

overspread, 376 

spectral density, 370 

underspread, 376 
Cofunction, 40 
Coherence, complete, 138-40 
Coherent light, detection in 

background light, 482-4 
Coherent signal plus stochastic signal, 

detection, 451-6 
Col, 198 

Colored noise, 37, 42 
Comb filter, 260 
Comb function, 260, 377, 387 
Completeness relation, 32 
Complex envelope, 90 
Composite hypothesis (see 

Hypothesis, composite) 
Concentration ellipsoid, 242 
Conditional expected value, 229 

stochastic signal, 445 
Conditional probability, 3 
Conditional risk, estimator, 228 
Constant-false-alarm rate (CFAR) 
receiver, 306, 316 

detection probability, 308 
graphs, 310-1 
Convex function, 197, 289 
Correlation coefficient, 29 

sample, 331 

signals, 117 
Correlation receiver, 45 
Cost, average, composite hypotheses, 
121 

distributed detection, 180 

minimization, 6-8 
Cost function, 227 

quadratic, 228 
Cost-likelihood ratio, 121 
Cost matrix, 7 
Counting functional, 276 
Covariance, time and frequency 

estimates, 257 
Covariance matrix, 29 

state vector, 459 
Cramer-Rao inequality, 239-42 
Cross-ambiguity function, 392 
Crossing rate, 276 

Gaussian process, 277 

rectified Gaussian process, 
280 

Cumulants, 190 
Curvature, 204 
Cutoff frequency, 372 



D 

Decision level, 9 
approximate computation: 
Bayes, 217 

Neyman-Pearson, 215 
random: 
detection probability, 538-40 
false-alarm probability, 540 
Decision surface, 9 

Neyman-Pearson criterion, 10 
Decision theory: 
binary, 8 
references, 8 
Deflection, 168 
Delta function, 32, 35 
derivative, 62, 512 
periodic, 366 
Detection: 
maximum-likelihood (see 

Maximum-likelihood detection) 
nonparametric, 315 
probability of, 9 
Detection probability: 
average, 124 
known signal, 44 

graph, 47 
randomized strategy, 21 
Detection statistic, stochastic signal, 
402 

Detector, photoelectric, 471 
Differentiation, functions of 

functions, 489 
Diffraction pattern, 388 
Dirac delta function, 32, 35 (see also 

Delta function) 
Discrete random variables, decisions 

based on, 18-23 
Discrete-time processing: 
known signal, 58-9 
stochastic signal, 456 
Discriminator, 92 
Distributed detection, 137, 

174-85 

Distribution-free receiver, 316 
Diversity communications, 137 
Doppler shift, 250, 544 
effect on detection in clutter, 376, 378 

Dummy hypothesis, 4 
Duration, mean-square, 254 
Durbin algorithm, 460, 464 
Dynode, 486 



E 

Eckart filter, 406 

Edgeworth series, 194 

Effective signal-to-noise ratio (see 

Signal-to-noise ratio, effective) 
Efficacy, 168 
threshold detector, 297 
Efficient estimator, 240 
asymptotic, 241 
Eigenfunctions, 37 

orthonormality, 38 
Eigenvalues: 
homogeneous integral equations, 

37, 516 
sum of, 81 
Energy detector, 268 
Energy of signal, 47, 69 
narrowband signal, 92 
Epoch, 220 

Equal-eigenvalue approximation, 
429 

Equipollent receivers, 167 
Equipotentials, 202 
Error: 

first and second kinds, 9 

probability of, 9 
Error covariance matrix, 227, 231, 239 

signal parameters, 257 
Error function, 191 
Error-function integral, 6 

asymptotic form, 85 
Estimanda, 221 

Estimation, signal parameters, 242 
Estimator, 221 
arrival time, narrowband signal, 
246-9 



Bayes, 227-8 
efficient, 240 

asymptotic, 241 
jointly Gaussian parameters, 226 
linear, 227, 230 

maximum-a-posteriori-probability, 
223 

maximum-likelihood, 223-4, 232 

signal amplitude, 232 

variance, 236 
mean of Gaussian distribution, 
224-6 

minimum-mean-square-error, 228 
signal arrival time, 233 
stochastic signal, causal, 429 
unbiased, 222 

Estimator-correlator interpretation, 
410 

Estimator-correlator representation, 
causal, 438 

Expected value, conditional, 229 

Exponential distribution, 303 

bilateral, 321, 324 

discrete, 476 



F 

F distribution, 541 
Factorial moments, 208 
Fading (see also Swerling cases) 
balanced channel, 134 
distributed detection, 176, 184 
maximum-likelihood receiver, 150 
multiple signals, 142 

detection probability, 147-51 
Rayleigh, 126, 152 
Swerling cases, 152 
Fading signals, detection probability, 
537-8 

False alarm, probability of, 9 
known signal in Gaussian noise, 44 
randomized strategy, 21 
resolution: 

multiple signals, 365 

two signals, 360 
stochastic signals, 417 
threshold detector, 418 
False-alarm number, 154 
False-alarm rate, 275 (see also 

Crossing rate) 
False dismissal, 9 
Filter, narrowband, 93 
Filter bank: 

outputs, 285 

signals in clutter, 378 
First-passage-time probability density 

function, 274 
Fisher formula, estimation error, 237 

generalized, 238 
Fisher information matrix, 238 

signal parameters, 243, 256 
Fixed-sample-size test, 336 
Fluctuating point target, 396 
Flux lines, 202 
Fourier series, 31 
Frank code, 393 

Fredholm determinant, 399, 416, 474, 
480 

finite, 462, 465 
Lorentz spectral density, 433 
rational spectral density, 436 
stochastic signal, 431-3 

in white noise, 416, 448 
Toeplitz approximation, 422 
rational spectral density, 423 
Fredholm integral equation, 

homogeneous, 37 (see also 
Homogeneous integral 
equations) 
Frequency deviation, mean-square, 

249, 254 
Frequency modulation, 89 
Fusion center, 174 



G 

Gain: 
antenna, 73 

avalanche photodiode, 492 
photomultiplier, 488 



Gain pattern, 160 

Gamma distribution, 146 

Gamma function, 18 

Gaussian approximation, 211 

Gaussian noise: 
bivariate density function, 29 
multivariate density function, 28 
univariate density function, 5, 29 

Gaussian pulse, ambiguity function, 
382 

Geometric distribution, 20, 476 
Gram-Charlier series, 191 
Gram-Schmidt orthogonalization 

procedure, 31, 33-4, 75 
Ground mapper, 357 



H 

Hermite polynomial, 191 
Hermite signals, 383 
Hermitian kernel, 37 
Hermitian matrix, 398 
Heterodyne receiver, 140 
optical, 497 
output signal-to-noise ratio, 501 
Hilbert space, 37 
Hilbert transform, 92 
Hohlraum, 69 
Homodyne detector, 90 
optical, 503 
output signal-to-noise ratio, 
503 

Homogeneity, statistical, 165, 317, 
337 

Homogeneous integral equations, 37 

solution, 65-8, 516 
Hypergeometric distribution, 478 
Hyperspherical shell, volume, 17 
Hypothesis: 

alternative, 8 

composite, 22, 107, 120 

nonparametric, 315 

null, 8 

parametric, 315 
simple, 120 
Neyman type A distribution, 490 
Noise: 
colored, 37 

Gaussian (see Gaussian noise) 
leucogenic, 61, 64, 442 
narrowband (see Narrowband noise) 
stationary, 30 
thermal, 27 

white (see White noise) 
Noise figure, 74 

Nonary communication system, 86 
Noncentral Rayleigh distribution (see 

Rayleigh-Rice distribution) 
Noncentrality parameter, 541 
Nonparametric receiver, 316 

asymptotically, 320 
Norm, 40 

Normal distribution (see Gaussian 

noise) 
Nyquist's law, 71 



O 

Observation interval, 2 

infinite, 49 
stochastic signals, 404 
On-off system, 85 
Operating characteristic, 12-4 

discrete data, 21 

known signal (graph), 48 

slope, 13, 26 
Orthogonal signals, 75, 79 
Orthogonality: 

principle of, 230-2, 433, 446 

random variables, 230 
Orthonormal functions, 31, 40 
Osculatory parabola, 205 



P 

Parameter space, 120 
Parameters: 
invariable, 165 
sequential detection, 338 



unknown, 172 
signal, 120 
estimation, 246 
Paraxial approximation, 159 
Periodic codes, 390 
Permanent, 508 
Phase, random, 107 
Phase difference, distribution, 133 
Phase modulation, 89 
Phased array, 159 
Phasor, 91 

Photocount distributions, 472 
Photodetector, output current, 494 
Photoelectrons, 471 

emission times, distribution, 504 
Photomultiplier, 486 

single-stage, 490 
Photon counting, 472 
Photon energy, 471 
Planck law, 480 
Planck's constant, 70, 471 
Poisson distribution, 25, 209, 479 

probability-generating function, 
209, 473 

saddlepoint integration, 209 

secondary electrons, 489 
Poisson limit, 485 

photoelectron emission times, 505 
Poisson point process, 35, 97, 472, 
494, 504 

Polarity coincidence correlator, 332 
Polya distribution, 489 
Polyphase codes, 389 
Positive-definite kernel, 39 
Posterior probability, 3, 19 
Power of test, 9 
Principle of orthogonality (see 

Orthogonality, principle of) 
Prior probability, 3 

Prior probability density function, 120 
Probability: 
conditional, definition, 3 
of correct decision, 3 
multiple signals, 78 
orthogonal signals, 80 
simplex signals, 80 
of detection (see Detection 

probability) 
of error, 3 

multiple signals, 78 
of false-alarm (see False-alarm 

probability) 
posterior, 3 
prior, 3 

Probability density function, 2 
nominal, 317 

Probability generating function, 207 
binomial distribution, 325 
output of avalanche diode, 492 
output of photomultiplier, 488 
photocount distribution, 473 
Poisson distribution, 209, 473 
Polya distribution, 489 
rank statistic, 327 
rank-sum statistic, 330 

Probability mass function, 19 

Probability of detection (see 
Detection probability) 

Proper values, 37 

Pulse-amplitude modulated 

communication, 5, 75, 85 
error probability, 6 

Pulse-compression radar, 383 

Pulse-position modulation, optical, 
509 

Pulse-repetition rate, 155 
Pulse train, 259 

ambiguity function, 386 

coded, 389 

detection in clutter, 377 

Q 

Q function, 113, 523 
asymptotic forms, 525 
Mth order, 146, 526 
computation by recurrence, 

528-30 
Edgeworth series, 294 
saddlepoint approximation, 214, 
216 



saddlepoint integration, 200, 205 
trigonometric integrals, 527 
trigonometric integral, 534 
Quadratic detector, comparison with 

linear, 171-2 
Quadratic form, moment-generating 

function, 535 
Quadratic programming, 299 
Quadratic rectifier, filtered output, 414 
Quadratic threshold detector, 143 

performance, 144-7 
Quadrature components, 90 

narrowband noise, 96 
Quantization, 176 
Quantum efficiency, 471 
Quasiharmonic signal, 91 
Quaternary communication system, 75 

R 

Radar: 

pulse-compression, 383 

range measurement, 249 

receiver, 272 
Radiometer, 268, 406 
Radon-Nikodym derivative, 48 
Rake radiometer, 407 
Random avalanche gain, 491 

moments, 493 
Randomization, 21 

rank test, 327 

sign test, 321 
Range bins, 156, 305 
Rank test, 326 

asymptotic relative efficiency, 545 
Rational spectral density, 61, 422, 511 

discrete data, 459 

incoherent light, 479 
Rayleigh distribution, 111 
Rayleigh fading (see Fading) 
Rayleigh-Rice distribution, 112, 523 

moments, 524, 526 
Rayleigh-Ritz method, 67 
Receiver: 

circuit model, 71-4 
Neyman-Pearson, 318 
nonparametric, 316 
unbiased, 317 
with reference input, 330 
Receivers, equipollent, 167 
Rectangular spectral density: 
autocovariance function, 424 
incoherent light, 478 
Rectifier, 92 
linear, 92 
quadratic, 92 
Regularity domain, Laplace 

transforms, 189 
Relative efficiency, sign test, 322 
Relative frequency, 3 
Reliability, 167 
Repetition period, 259 
Reproducing-kernel Hilbert space, 

40-1, 57 
Resolution, 356 
many signals, 363-8 
signals in time, 365-8 
two signals, 357-62 
Resolvability of signals, 356 
Resonant circuit, 94 
Riccati equation, 447 
Rice-Nakagami distribution, 112 (see 

also Rayleigh-Rice distribution) 
Risk, 228 

conditional, 7, 228 
RKHS (see Reproducing-kernel 

Hilbert space) 
Robust detection, 317 



S 

Saddlepoint, 198 

locating, 199, 209 
Saddlepoint approximation, 213-5 
Saddlepoint integration, 199 

curved path, 205 

detectability of stochastic signal, 
420 

Lorentz spectral density, 549 



Sample mean, 16, 123, 225 
Samples, 166 
Sampling, 269 

by orthonormal functions, 31-3, 397 
Scalar product, 33, 77, 397 

in RKHS, 40, 82 
Scatter-multipath communication, 97, 
395 

Schwarz inequality, 258 (see also 

Cauchy-Schwarz inequality) 
expectations, 170, 239 
Schweppe likelihood-ratio receiver, 449-50 
Secant method, 199 
Seismology, 396 

Self-transform property, ambiguity 

function, 380 
Semi-invariants, 190 
Sensors, distributed detection: 
identical, 175-80 
nonidentical, 180-5 
Sequential detection, 337 
signals of random phase, 347-50 
targets of unknown distance, 350-4 
Sequential probability ratio test, 339 
average sample number, 344 
detection probability, 342 
variance of number of tests, 346 
Shot noise, 493-7 

heterodyne detector, 500 
Sign test, 320 

asymptotic relative efficiency, 321 
Signal space, 74-8 
Signal-to-noise ratio, 44, 45 
effective, 82, 168, 296 
heterodyne detector, 501 
homodyne detector, 503 
likelihood-ratio detector, 170 
threshold detector, 170 
loss, narrowband signals (graph), 114 

maximum attainable, 49, 54, 70 
narrowband signal, 106 
signal in white noise, 47 
Signal vectors, 75 
displacement, 79 
Signals: 

analytic, 91 

antipodal, 85 

bandlimited, 373 

minimum detectable, 46, 73 

multiple coherent, 138-40 

multiple incoherent, 140-1 

narrowband, 91 

orthogonal, 75, 79 

quasiharmonic, 91 

random phase, 107-9 
detection probability, 111 
sequential detection, 347 

simplex, 80 

stochastic, 394 

unknown amplitude, detection, 
125-7 

unknown sign, detection, 293 
Simplex, 80 
Singular detection, 48 

stochastic signals, 411 
Size of test, 9 

Sodium, wavelength of chief line in 

spectrum, 481 
Spectral density, 30 
bilateral, 35 
least favorable, 312 
Lorentz (see Lorentz spectral 

density) 
matrix, 444 
narrowband, 96 
rational, 61, 422, 511 
discrete data, 459 
incoherent light, 479 
unilateral, 34 
Specular component, 451 
Spread-spectrum communication 

system, 128 
State equations, 441 
State-space model, generation of 

stochastic signal, 442 
State vector, 441 

covariance matrix, 443 
Steepest descent, 201 

tracing path of, 206 
Stochastic signal, 394 



detectability: 
Lorentz spectral density, 424 
rectangular spectral density, 
424 

in white noise, 415-8 
discrete data, 456 

detectability, 462-5 
optimum detector, 402 
plus coherent signal, detection, 
451 

threshold detector, 405, 418-20 
Strategy, 2 

minimax, 23 
Student's t statistic, 319 
Sufficient statistic, 14 
detection: 
known signal, 44-9 
signal of random phase, 109 
estimation, 225, 241 
Swerling cases, detector performance, 
153-6 

Symmetrical kernel, 37 



T 

t-test, 319 

Temperature, effective, of noise, 69 
Temporal coherence function, 471 
Temporal modes, 475 
Ternary communication system, 85 
Thermal noise, 27, 69 
Thevenin equivalent of receiver, 71 
Threshold approximation, validity, 
420 

Threshold detector: 
multiple inputs, 143 

performance, 144-7 
narrowband signals, 298 
noise of unknown strength, 305 
sequential test, 349 
signal of unknown arrival time, 268 
signal with unknown parameters, 
296 

stochastic signals, 405 
in white noise, 418-20 
Threshold effect, 246 
Threshold statistic, 127 
Time-bandwidth product, 385 
Toeplitz approximation, 422, 477 
Toeplitz matrix, 365 
Trace, 399 

Transducer array, 137, 156 

beam pattern, 160 
Transfer function, 30 

matched filter, 52 

narrowband filter, 93 
Transition probability density 

function, 274 
Transmission line, 69 
Trapezoidal rule, 200 
Two-input system, 331 



U 

Uncertainty ellipse, 258, 382 
Uniform asymptotic expansion, 217 
Uniformly most powerful test, 123, 
126 

Union bound, 365 
Unit step function, 212 



V 

Vector-space representation, 397 
Velocity of target, measurement, 250 



W 

Weak-signal approximation, 127 

multiple signals, 142-4 
Weak-signal detector (see Threshold 

detector) 
White noise, 34-7 

autocovariance function, 35 

linear filtering, 36 

narrowband, 103-4 

spectral density, 34 
Whitening filter, 54 
Wilcoxon signed rank statistic, 326 
Wilcoxon two-sample test, 330 



Z 

Zero-crossing problem, 275 
Ziv-Zakai bound, 244-6 